Regulatory DNA sequences of the gene for the human catalytic telomerase
subunit, and their diagnostic and therapeutic use



Structure and function of the chromosome ends 

The genetic material of eukaryotic cells is distributed on linear chromosomes. The
ends of hereditary units are termed telomeres, derived from the Greek words telos
(end) and meros (part, segment). Most telomeres consist of repeats of short sequences
which are mainly composed of thymine and guanine (Zakian, 1995). In all the
vertebrates which have so far been investigated, the telomeres consist of the sequence
TTAGGG (Meyne et al., 1989).

The telomeres have a variety of important functions. They prevent the fusion of
chromosomes (McClintock, 1941) and thus the formation of dicentric hereditary
units. Such chromosomes, which have two centromeres, can lead to the development of
cancer through loss of heterozygosity, or through the duplication or loss of genes.

In addition, telomeres serve to distinguish intact hereditary units from
damaged ones. Thus, yeast cells ceased dividing when they
contained a chromosome without a telomere (Sandell and Zakian, 1993).

Telomeres fulfil another important task in association with the replication of
eukaryotic cell DNA. In contrast to the circular genomes of prokaryotes, the linear
chromosomes of eukaryotes cannot be completely replicated by the DNA polymerase
complex. RNA primers are required to initiate DNA replication. After elimination of
the RNA primers, extension of the Okazaki fragments and subsequent ligation, the
newly synthesized DNA strand lacks the 5' end since the RNA primer cannot be
replaced by DNA at that point. Without special protective mechanisms, the
chromosomes would therefore shrink with each cell division ("end-replication
problem"; Harley et al., 1990). The non-coding telomere sequences presumably
constitute a buffer zone for preventing the loss of genes (Sandell and Zakian, 1993).
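
The buffer-zone idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that a fixed number of base pairs is lost at each division and that division stops at a critical length; the function name and all numerical values are hypothetical and are not taken from the text.

```python
def divisions_until_critical(initial_bp: int, loss_per_division_bp: int, critical_bp: int) -> int:
    """Count the divisions until the telomere reaches or falls below a critical length.

    Simplified model: a constant number of base pairs is lost per division.
    """
    if loss_per_division_bp <= 0:
        raise ValueError("loss per division must be positive")
    length = initial_bp
    divisions = 0
    while length > critical_bp:
        length -= loss_per_division_bp
        divisions += 1
    return divisions


# Assumed, illustrative values only: 10 kb telomere, 100 bp lost per division,
# division stops once the telomere is 5 kb long or shorter.
print(divisions_until_critical(10_000, 100, 5_000))
```

In this simplified model the number of remaining divisions depends only on the telomere length above the critical value and on the loss per division.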

In addition to this, telomeres also play an important role in regulating cell ageing
(Olovnikov, 1973). Human somatic cells exhibit a limited capacity for replication in
culture; after a certain period of time, they become senescent. In this state, the cells
no longer divide even after having been stimulated with growth factors; however,
they do not die and remain metabolically active (Goldstein, 1990). Various
observations support the hypothesis that a cell determines how many more times it
can divide on the basis of the length of its telomeres (Allsopp et al., 1992).

In summary, telomeres have key functions in the ageing of cells,
in stabilizing the genetic material and in preventing cancer.

The enzyme telomerase synthesizes the telomeres 

As described above, organisms which possess linear chromosomes can only replicate
their genome incompletely in the absence of a special protective mechanism. Most
eukaryotes use a special enzyme, telomerase, to regenerate the telomere
sequences. Telomerase is expressed constitutively in the single-cell organisms which
have so far been investigated. In humans, on the other hand, telomerase activity has
only been measured in germ cells and tumour cells, whereas neighbouring somatic
tissue did not contain any telomerase (Kim et al., 1994).

Telomerase can also be designated functionally as terminal telomere transferase,
which is located in the cell nucleus as a multiprotein complex. While the RNA
moiety of human telomerase has been known for a relatively long period of time
(Feng et al., 1995), the catalytic subunit of this enzyme group was recently identified
in a variety of organisms (Lingner et al., 1997; cf. our application PCT/EP/98/03468
which is likewise pending). These catalytic subunits of telomerase are strikingly
homologous both among themselves and in relation to all previously known reverse
transcriptases.

WO 98/14592 also describes nucleic acid and amino acid sequences of the catalytic
telomerase subunit.

Activation of telomerase in human tumours

It was originally only possible to demonstrate telomerase activity in humans in germ
line cells and not in normal somatic cells (Hastie et al., 1990; Kim et al., 1994).
Following the development of a more sensitive detection method (Kim et al., 1994),
a low telomerase activity was also detected in hematopoietic cells (Broccoli et al.,
1995; Counter et al., 1995; Hiyama et al., 1995). However, these cells
nevertheless exhibited a reduction in the telomeres (Vaziri et al., 1994; Counter et
al., 1995). It has still not been resolved whether the quantity of enzyme in these cells
is insufficient to compensate for the telomere loss or whether the telomerase activity
which is measured stems from a subpopulation, e.g. incompletely differentiated
CD34+38+ precursor cells (Hiyama et al., 1995). In order to resolve this, it would be
necessary to detect telomerase activity in a single cell.

Interestingly, however, significant telomerase activity was detected in a large number
of the tumour tissues which had thus far been tested (1734/2031, 85%; Shay, 1997),
whereas virtually no activity was found in normal somatic tissue (1/196, <1%; Shay,
1997). In addition, various investigations have shown that the telomeres still shrank in
senescent cells which were transformed with viral oncoproteins and it was only
possible to detect telomerase in the subpopulation which survived the growth crisis
(Counter et al., 1992). The telomeres were also stable in these immortalized cells
(Counter et al., 1992). Similar findings from investigations in mice (Blasco et al.,
1996) support the assumption that reactivation of the telomerase is a late event in
tumorigenesis.
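
The proportions quoted from Shay (1997) can be checked with a short calculation. The sketch below uses only the counts given in the text; the function name is a hypothetical helper.

```python
def percent_positive(positive: int, total: int) -> float:
    """Return the share of telomerase-positive samples as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * positive / total


# Counts as quoted in the text (Shay, 1997)
tumour = percent_positive(1734, 2031)
normal = percent_positive(1, 196)

print(f"tumour tissue: {tumour:.1f}%")
print(f"normal somatic tissue: {normal:.2f}%")
```

The results are consistent with the rounded figures of 85% and <1% given above.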

Based on these results, a "telomerase hypothesis" was developed which links the loss
of telomere sequences and cell ageing with telomerase activity and the development
of cancer. In long-lived species such as humans, the shrinking of the telomeres can be
regarded as being a mechanism for suppressing tumours. Differentiated cells which
do not contain any telomerase cease their cell division at a particular telomere length.
If such a cell mutates, it can only form a tumour if the cell can extend its telomeres.
Otherwise, the cell would continue to lose telomere sequences until its chromosomes
became unstable and it was finally destroyed. Telomerase reactivation is presumably
the main mechanism used by tumour cells to stabilize their telomeres.

It follows from these observations and considerations that it should be possible to
treat tumours by inhibiting the telomerase. Conventional cancer therapies using
cytostatic agents or short-wave radiation damage all the dividing cells in the body in
addition to the tumour cells. However, since only germ line cells, apart from tumour
cells, contain significant telomerase activity, telomerase inhibitors would attack the
tumour cells more specifically and consequently elicit fewer undesirable side effects.
Telomerase activity has been detected in all the tumour tissues which have so far
been tested, which means that these therapeutic agents could be employed against all
types of cancer. The effect of telomerase inhibitors would then set in when the
telomeres of the cells had shortened to such an extent that the genome became
unstable. Since tumour cells usually possess telomeres which are shorter than those
of normal somatic cells, cancer cells would be the first to be eliminated by the
telomerase inhibitors. By contrast, cells possessing long telomeres, such as the germ
cells, would only be damaged at a much later date. Telomerase inhibitors
consequently represent a potential way forward in the treatment of cancer.

Once the manner in which expression of the telomerase gene is regulated has also been
identified, it becomes possible to obtain unambiguous answers to the question of the
nature and points of attack of physiological telomerase inhibitors.

Regulation of gene expression in eukaryotes

There are a large number of points in eukaryotic gene expression, i.e. the cellular
flow of information from the DNA to the protein by way of the RNA, at which
regulatory mechanisms can exert an effect. Examples of individual control steps are
gene amplification, the recombination of gene loci, chromatin structure, DNA
methylation, transcription, post-transcriptional modifications of mRNA, mRNA
transport, translation and post-translational modifications of proteins. Studies which
have been carried out to date indicate that control at the level of transcription
initiation is of the greatest importance (Latchman, 1991).

A region which is responsible for regulating transcription, and which is designated
the promoter region, is located directly upstream of the transcription start of a gene
which is transcribed by RNA polymerase II. Comparison of the nucleotide sequences
of promoter regions from a large number of known genes shows that particular
sequence motifs occur regularly in this region. These elements include, inter alia, the
TATA box, the CCAAT box and the GC box, which are recognized by
specific proteins. The TATA box, which is located about 30 nucleotides upstream of
the transcription start, is, for example, recognized by the TFIID subunit TBP ("TATA
box-binding protein"), whereas particular GC-rich sequence segments are specifically
bound by the transcription factor Sp1 ("specificity protein 1").

The promoter can be functionally subdivided into a regulatory segment and a
constitutive segment (Latchman, 1991). The constitutive control region comprises the
so-called core promoter which enables transcription to be initiated correctly. This
promoter contains the sequence elements described as UPEs (upstream
promoter elements), which are necessary for efficient transcription. The regulatory
control segments, which can be interlaced with the UPEs, possess sequence elements
which can be involved in the signal-dependent regulation of transcription by
hormones, growth factors, etc. They impart tissue-specific or cell-specific promoter
properties.

DNA segments which are able to exert an influence on gene expression over
relatively large distances are a characteristic feature of eukaryotic genes. These
elements can be located upstream or downstream of a transcription unit, or within the
unit, and can perform their function independently of their orientation. These
sequence segments may reinforce (enhancers) or attenuate (silencers) promoter
activity. In a similar way to the promoter regions, enhancers and silencers also
accommodate several binding sites for transcription factors.

The invention relates to the DNA sequences from the 5'-flanking region of the gene
for the catalytically active human telomerase subunit and intron sequences for this
gene.

The invention particularly relates to the 5'-flanking regulatory DNA sequence which
contains the promoter DNA sequence for the gene for the human catalytic telomerase
subunit, as depicted in Fig. 10 (SEQ ID NO 3).

The invention furthermore relates to part regions of the 5'-flanking regulatory DNA
sequence, as depicted in Fig. 4 (SEQ ID NO 1), which have a regulatory effect.

Intron sequences for the gene for the human catalytic telomerase subunit, in
particular those sequences which have a regulatory effect, are also part of the
subject-matter of the present invention. The intron sequences according to the invention are
described in detail in the context of Example 5 (cf. SEQ ID NO 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19 and 20).

The invention furthermore relates to a recombinant construct which comprises the
DNA sequences according to the invention, in particular the 5'-flanking DNA
sequence of the gene for the human catalytic telomerase subunit, or part regions
thereof.

Preference is given to recombinant constructs which, in addition to the DNA
sequences according to the invention, in particular the 5'-flanking DNA sequence of
the gene for the human catalytic telomerase subunit, or part regions thereof, also
contain one or more additional DNA sequences which encode polypeptides or
proteins.

According to a particularly preferred embodiment, these additional DNA sequences
encode antineoplastic proteins.

Particular preference is given to those antineoplastic proteins which inhibit
angiogenesis directly or indirectly. Examples of these proteins are:

Plasminogen activator inhibitor (PAI-1), PAI-2, PAI-3, angiostatin, endostatin,
platelet factor 4, TIMP-1, TIMP-2, TIMP-3 and leukaemia inhibitory factor (LIF).

Antineoplastic proteins which have a direct or indirect cytostatic effect on tumours 
are likewise particularly preferred. These proteins include, in particular: 

perforin, granzyme, IL-2, IL-4, IL-12, interferons, such as IFN-α, IFN-β and
IFN-γ, TNF, TNF-α, TNF-β, oncostatin M; tumour suppressor genes, such as p53,
retinoblastoma.

Particular preference is furthermore given to antineoplastic proteins which, where
appropriate in addition to their antineoplastic effect, stimulate inflammation and
thereby contribute to the elimination of tumour cells. Examples of these proteins are:

RANTES, monocyte chemotactic and activating factor (MCAF), IL-8, macrophage
inflammatory protein (MIP-1α, -β), neutrophil activating protein-2 (NAP-2), IL-3,
IL-5, human leukaemia inhibitory factor (LIF), IL-7, IL-11, IL-13, GM-CSF, G-CSF and
M-CSF.

Particular preference is furthermore given to antineoplastic proteins which, due to
their action as enzymes, are able to convert precursors of an antineoplastic active
compound into an antineoplastic active compound. Examples of these enzymes are:

herpes simplex virus thymidine kinase, varicella zoster virus thymidine kinase,
bacterial nitroreductase, bacterial β-glucuronidase, plant β-glucuronidase from Secale
cereale, human glucuronidase, human carboxypeptidase, bacterial carboxypeptidase,
bacterial β-lactamase, bacterial cytosine deaminase, human catalase and/or
phosphatase, human alkaline phosphatase, type 5 acid phosphatase, human
lysooxidase, human acid D-aminooxidase, human glutathione peroxidase, human
eosinophil peroxidase and human thyroid peroxidase.

The abovementioned recombinant constructs can also contain DNA sequences which
encode factor VIII or factor IX, or part fragments thereof. These DNA sequences also
include those encoding other blood clotting factors.

The abovementioned recombinant constructs can also contain DNA sequences which
encode a reporter protein. Examples of these reporter proteins are:

Chloramphenicol acetyl transferase (CAT), firefly luciferase (LUC), β-galactosidase
(β-Gal), secreted alkaline phosphatase (SEAP), human growth hormone
(hGH), β-glucuronidase (GUS), green fluorescent protein (GFP), and all the variants
derived therefrom, aequorin and obelin.

Recombinant constructs according to the invention can also contain DNA which
encodes the human catalytic telomerase subunit and its variants and fragments in the
antisense orientation. Where appropriate, these constructs can also contain other
protein subunits of the human telomerase and the telomerase RNA component in the
antisense orientation.

The recombinant constructs can, in addition to the DNA which encodes the human 
catalytic telomerase subunit, and its variants and fragments, also contain other 
protein subunits of the human telomerase and the telomerase RNA component. 

The invention furthermore relates to a vector which contains the abovementioned 
DNA sequences according to the invention, in particular the 5'-flanking DNA 
sequences and also one or more of the other DNA sequences mentioned above. 

The preferred vector for these constructs is a virus, for example a retrovirus, an
adenovirus, an adeno-associated virus, a herpes simplex virus, a vaccinia virus, a
lentivirus, a Sindbis virus or a Semliki forest virus.

Preference is also given to using plasmids as vectors.

The invention furthermore relates to pharmaceutical preparations which comprise
recombinant constructs or vectors according to the invention; for example a
preparation in a colloidal dispersion system.

Examples of suitable colloidal dispersion systems are liposomes or polylysine 
ligands. 

The preparations of the constructs or vectors according to the invention in colloidal 
dispersion systems can be supplemented with a ligand which binds to the membrane 
structures of tumour cells. Such a ligand can, for example, be attached to the 
construct or the vector or else be a component of the liposome structure. 

Suitable ligands are, in particular, polyclonal or monoclonal antibodies, or antibody
fragments thereof, which bind, by their variable domains, to the membrane structures
of tumour cells, or substances carrying mannose terminally, cytokines or growth
factors, or fragments or part sequences thereof, which bind to receptors on tumour
cells.

Examples of corresponding membrane structures are receptors for a cytokine or a
growth factor, such as IL-1, EGF, PDGF, VEGF, TGF-β, insulin or insulin-like
growth factor (ILGF), or adhesion molecules, such as SLeX, LFA-1, MAC-1,
LECAM-1 or VLA-4, or the mannose-6-phosphate receptor.

The present invention includes pharmaceutical preparations which, in addition to the
vector constructs according to the invention, can also comprise non-toxic, inert,
pharmaceutically suitable excipients. These preparations can be administered at the
site of a tumour or systemically (e.g. intravenously, intraarterially, intramuscularly,
subcutaneously, intradermally, anally, vaginally, nasally, transdermally,
intraperitoneally, as an aerosol or orally).

The vector constructs according to the invention can be employed in gene therapy.

The invention furthermore relates to a recombinant host cell, in particular a
recombinant eukaryotic host cell, which harbours the above-described constructs or
vectors.

The invention furthermore relates to a process for identifying substances which affect
the promoter activity, silencer activity or enhancer activity of the catalytic telomerase
subunit, with this process comprising the following steps:

A. adding a candidate substance to a host cell which harbours the regulatory
DNA sequence according to the invention, in particular the 5'-flanking
regulatory DNA sequence for the gene for the human catalytic telomerase
subunit, or a part region thereof which has a regulatory effect, which sequence
or part region is functionally linked to a reporter gene, and

B. measuring the effect of the substance on expression of the reporter gene.

The process can be employed for identifying substances which increase the promoter
activity, silencer activity or enhancer activity of the catalytic telomerase subunit.

The process can furthermore be employed for identifying substances which inhibit
the promoter activity, silencer activity or enhancer activity of the catalytic
telomerase subunit.
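
Step B of the screening process can be sketched as a simple comparison of reporter readings. The example below is a minimal sketch that assumes the reporter signal is measured with and without the candidate substance; the function names and the two-fold threshold are hypothetical and are not specified in the text.

```python
def fold_change(treated: float, untreated: float) -> float:
    """Ratio of the reporter signal with the candidate substance to the signal without it."""
    if untreated <= 0:
        raise ValueError("untreated reporter signal must be positive")
    return treated / untreated


def classify_candidate(treated: float, untreated: float, threshold: float = 2.0) -> str:
    """Classify a substance by its effect on reporter expression.

    The threshold is an assumed cut-off chosen only for illustration.
    """
    change = fold_change(treated, untreated)
    if change >= threshold:
        return "increases"
    if change <= 1.0 / threshold:
        return "inhibits"
    return "no clear effect"


# Hypothetical reporter readings (arbitrary units)
print(classify_candidate(400.0, 100.0))
print(classify_candidate(20.0, 100.0))
print(classify_candidate(110.0, 100.0))
```

A real assay would additionally require replicates and suitable controls; the sketch shows only the comparison step.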

The invention furthermore relates to a process for identifying factors which bind
specifically to fragments of the DNA sequences according to the invention, in
particular the 5'-flanking regulatory DNA sequence of the catalytic telomerase
subunit. This method comprises screening an expression cDNA library using the
above-described DNA sequence, or subfragments of widely differing length, as the
probe.

The above-described constructs or vectors can also be used for preparing transgenic
animals.

The invention furthermore relates to a process for detecting telomerase-associated
conditions in a patient, which process comprises the following steps:

A. incubating a construct or vector, which contains the DNA sequence according
to the invention, in particular the 5'-flanking regulatory DNA sequence for the
gene for the human catalytic telomerase subunit, or a part region thereof
having a regulatory effect, and a reporter gene, with body fluids or cell
samples,

B. detecting the activity of the reporter gene in order to obtain a diagnostic value;
and

C. comparing the diagnostic value with standard values for the reporter gene
construct in standardized normal cells or body fluids of the same type as the
test sample.

The detection of diagnostic values which are higher or lower than the standard
comparative values indicates a telomerase-associated condition, which in turn
indicates a pathogenic condition.
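
Step C can be sketched as a comparison against a range derived from the standard values. This is a minimal sketch: the text does not say how far a value must deviate, so the rule used here (mean plus or minus k standard deviations, with k = 2) is an assumption, as are the function name and the example readings.

```python
from statistics import mean, stdev


def compare_with_standard(diagnostic_value: float, standard_values: list, k: float = 2.0) -> str:
    """Compare a diagnostic value with a range derived from standard values.

    Assumed rule for illustration: values outside mean +/- k standard
    deviations of the standard values count as higher or lower.
    """
    if len(standard_values) < 2:
        raise ValueError("at least two standard values are needed")
    centre = mean(standard_values)
    spread = stdev(standard_values)
    if diagnostic_value > centre + k * spread:
        return "higher"
    if diagnostic_value < centre - k * spread:
        return "lower"
    return "within standard range"


# Hypothetical reporter activities measured in standardized normal cells
standards = [98.0, 100.0, 102.0, 101.0, 99.0]
print(compare_with_standard(150.0, standards))
print(compare_with_standard(101.0, standards))
```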

Explanation of the figures:

Fig. 1: Southern blot analysis using genomic DNA from various species

A: Photograph of an ethidium bromide-stained 0.7% agarose gel
containing approximately 4 µg of Eco RI-cut genomic DNA. Track 1
contains Hind III-cut λ DNA as size markers (23.5, 9.4, 6.7, 4.4, 2.3, 2.0
and 0.6 kb). Tracks 2 to 10 contain human, rhesus monkey, Sprague
Dawley rat, BALB/c mouse, dog, bovine, rabbit, chicken and yeast
(Saccharomyces cerevisiae) genomic DNA.

B: Autoradiogram, corresponding to Fig. 1 A, of a Southern blot analysis
in which a radioactively labelled hTC cDNA probe of about 720 bp in
length is used for the hybridization.



Fig. 2: Restriction analysis of the recombinant λ DNA of the phage clone P12,
which hybridizes with a probe from the 5' region of the hTC cDNA.

The figure shows a photograph of an ethidium bromide-stained 0.4%
agarose gel. Tracks 1 and 2 contain Eco RI/Hind III-cut λ DNA and a
1 kb ladder from Gibco as size markers. Tracks 3-7 each contain 250 ng
of the DNA from the recombinant phage which has been cut with Bam
HI (track 3), Eco RI (track 4), Sal I (track 5), Xho I (track 6) and Sac I
(track 7). The arrows mark the two λ arms of the vector EMBL3 Sp6/T7.



Fig. 3: Restriction analysis and Southern blot analysis of the recombinant
λ DNA of the phage clone which hybridizes with a probe from the 5'
region of the hTC cDNA.

A: The figure shows a photograph of an ethidium bromide-stained 0.8%
agarose gel. Tracks 1 and 15 contain a 1 kb ladder from Gibco as size
markers. Tracks 2 to 14 each contain 250 ng of cut λ DNA from the
recombinant phage clone. The following enzymes were employed: track
2: Sac I, track 3: Xho I, track 4: Xho I, Xba I, track 5: Sac I, Xho I, track
6: Sal I, Xho I, Xba I, track 7: Sac I, Xho I, Xba I, track 8: Sac I, Sal I,
Xba I, track 9: Sac I, Sal I, BamH I, track 10: Sac I, Sal I, Xho I, track 11:
Not I, track 12: Sma I, track 13: empty, track 14: not digested.

B: Autoradiogram, corresponding to Fig. 3 A, of a Southern blot analysis.
A 5'-hTC cDNA fragment of about 420 bp in length was used as the
probe for the hybridization.

Fig. 4: Partial DNA sequence of the 5'-flanking region and of the promoter of
the gene for the human catalytic telomerase subunit. The ATG start
codon in the sequence is printed in bold. The depicted sequence
corresponds to SEQ ID NO 1.

Fig. 5: Use of primer extension analysis to identify the transcription start.

The figure shows an autoradiogram of a denaturing polyacrylamide gel
which was selected for depicting a primer extension analysis. An
oligonucleotide having the sequence
5' GTTAAGTTGTAGCTTACACTGGTTCTC 3' was used as the primer.
The primer extension reaction was loaded in track 1. Tracks G, A, T and
C constitute the sequence reactions using the same primer and the
corresponding dideoxynucleotides. The thick arrow marks the main
transcription start while the thin arrows point to three subsidiary
transcription start points.

Fig. 6: cDNA sequence of the human catalytic telomerase subunit (hTC; cf. our 
pending application PCT/EP/98/03468). The depicted sequence 
corresponds to SEQ ID NO 2. 

Fig. 7: Structural organization and restriction map of the human hTC gene and
its 5'-flanking and 3'-flanking regions.

Exons are shown as consecutively numbered rectangles which are filled
in in black, and introns are shown as regions which are not filled in.
Untranslated sequence segments in the exons are hatched. Translation
starts in exon 1 and ends in exon 16. Restriction enzyme cleavage sites
are marked as follows: S, SacI; X, XhoI. The relative arrangement of the
five phage clones (P2, P3, P5, P12, P17), and of the product from the
genome walking, are shown by thin lines. As the dots indicate, the
sequence of intron 16 has only been partly deciphered.

Fig. 8: hTC splice variants.

A: Diagrammatic structure of the hTC mRNA splice variants. The
complete hTC mRNA is depicted as a rectangle with a grey background
in the upper region of the figure. The 16 exons are depicted in accordance
with their size. The translation start (ATG) and the stop codon, and also
the telomerase-specific T motif, and the seven RT motifs, are all shown.
The hTC variants are subdivided into deletion and insertion variants. The
missing exon sequences are marked in the deletions. The insertions are
shown by additional white rectangles. The sizes and origins of the
inserted sequences are given. Newly formed stop codons are marked. The
size of the insertion in variant INS2 is unknown.

B: Exon-intron transitions in the hTC splice variants. Unspliced
5'-flanking and 3'-flanking sequences are shown as white rectangles. The
origins of the exon and intron sequences are given. Intron and exon
sequences are shown in small letters and large letters, respectively. The
donor and acceptor sequences in the splice sites are underlaid as grey
rectangles, and their exon and intron origins are also given.

Fig. 9: Identification of the transcription start by means of RT-PCR analysis.

The RT-PCR was carried out using a cDNA library prepared from HL 60
cells and genomic DNA as the positive control. A common 3' primer
hybridizes to a region of the exon 1 sequence. The positions of the
different 5' primers in the coding region or the 5'-flanking region are
given. In the negative control, no template DNA was added to the PCR
reaction. M: DNA size marker.

Fig. 10: Nucleotide sequence and structural features of the hTC promoter.

The figure depicts 11273 bp of the 5'-flanking hTC gene sequence,
beginning with the translation start codon ATG (+1). The putative region
of the translation start is underlined. Possible regulatory sequence
segments within the 4000 bp upstream of the translation start are ringed.
The depicted sequence corresponds to SEQ ID NO 3.

Fig. 11: Activity of the hTC promoter in HEK-293 cells.

The first 5000 bp of the 5'-flanking hTC gene region are shown
diagrammatically in the upper part of the figure. The ATG start codon is
picked out. CpG-rich islands are marked by grey rectangles. The sizes of
the hTC promoter-luciferase constructs are shown on the left-hand side of
the figure. The promoterless pGL2 basic construct and the SV40
promoter construct pGL2-Pro were used as controls in each transfection.
The relative luciferase activities of the different promoter constructs in
HEK cells are shown as continuous bars on the right-hand side of the
figure. The standard deviation is indicated. The numerical values
represent the average of two independent experiments which were carried
out in duplicate.

Tab. 1: Exon-intron transitions in the hTC gene

The table lists the nucleotide sequences at the 3' and 5' splice transitions
of the hTC gene. The consensus sequences for donor and acceptor
sequences (AG and GT) are underlaid with grey rectangles. The table
shows the intron sequences (small letters) and exon sequences (large
letters) which flank the splice acceptor and donor sites. The sizes of the
exons and introns are given in bp.



Tab. 2: Potential binding sites for DNA-binding factors in the nucleotide
sequence of intron 2

The search for possible DNA-binding factors (e.g. transcription factors)
was carried out using the "find pattern" algorithm from the Genetics
Computer Group (Madison, USA) GCG sequence analysis program
package. The table lists the abbreviations of the DNA-binding factors
which were identified and their location in intron 2.

Tab. 2

Factors                     Location in intron 2

C/EBP                       2925
CRE.2                       2749
Sp1                         2378, 4094, 4526, 4787, 4835, 4995
AP-2 CS3                    5099
AP-2 CS4                    2213, 3699, 4667, 5878, 5938, 6059, 6180, 6496
AP-2 CS5                    5350, 5798, 5880, 5940, 6061, 6182, 6375, 6498
PEA3                        934, 2505
P53                         2125
GR uteroglobin              848, 1487, 2956
PR uteroglobin              3331
Zeste-white                 1577, 1619, 1703, 1745, 1787, 1829, 1871, 1913, 1955,
                            1997, 2039, 2081, 3518, 3709, 4765, 5014, 5055
GRE                         846
MyoD-MCK right site/rev     447, 509, 558, 1370, 1595, 1900, 2028, 2099, 4557
MyoD-MCK left site          108, 118, 453, 1566, 1608, 1692, 1734, 1818, 1902,
                            1986, 2372, 2460, 2720, 3491, 5030
Ets-1 CS                    6408
AP1                         3784, 4406
CREB                        2801
GATA-1                      839, 1390, 3154
c-Myc                       108, 118, 453, 1566, 1608, 1692, 1734, 1818, 1902,
                            1986, 2372, 2460, 2720, 3491, 5030
CACCC site                  991
CCAAT site                  1224
CCAC box                    992
CAAT site                   463, 2395
Rb site                     992, 4663
TATA                        3650
CDEI                        106, 1564, 1606, 1690, 1732, 1816, 1900, 1984

Examples

The human gene for the catalytic telomerase subunit (ghTC), and the regions of this
gene located 5' and 3', were cloned, the start point for transcription was
determined, potential binding sites for DNA-binding proteins were identified and
active promoter fragments were highlighted. The sequence of the hTC cDNA (Fig. 6)
has already been reported in our application PCT/EP/98/03468, which is also
pending. Unless otherwise mentioned, all positions given for the cDNA refer to
this sequence.

Example 1 

A genomic Southern blot analysis was used to determine whether ghTC constitutes a
single gene in the human genome or whether there exist several loci for the hTC gene
and possibly also ghTC pseudogenes.

In order to do this, a commercially available zoo blot from Clontech was subjected to
Southern blot analysis. This blot contains 4 µg of Eco RI-cut genomic DNA from
nine different species (human, monkey, rat, mouse, dog, bovine, rabbit, chicken and
yeast). With the exception of yeast, chicken and human, the DNA was isolated from
kidney tissue. The human genomic DNA was isolated from placenta and the chicken
genomic DNA was purified from liver tissue. An hTC cDNA fragment of about 720
bp in length, which was isolated from hTC cDNA, variant Del2 (position 1685 to
2349 plus 2531 to 2590 in Fig. 6 [deletion 2; cf. Example 5 in Fig. 8]), was used as
the radioactively labelled probe in the autoradiogram in Fig. 1. The experimental
conditions for the blot hybridization and washing steps were taken from Ausubel et
al. (1987).

In the case of the human DNA, the probe recognizes two specific DNA fragments.
The smaller Eco RI fragment, of about 1.5 to 1.8 kb in length, probably
originates from two Eco RI cleavage sites in an intron in the ghTC DNA. On the
basis of this result, it is to be assumed that only a single ghTC gene is present in
the human genome.

Example 2 

In order to isolate the 5' flanking hTC gene sequence, approx. 1.5 x 10^6 phages from
a human genomic placenta gene library (EMBL 3 SP6/T7 from Clontech, order
number HL1067J) were hybridized on nitrocellulose filters (0.45 µm; from
Schleicher and Schuell), in accordance with the manufacturer's instructions, with a
radioactively labelled 5'-hTC cDNA fragment of about 500 bp in length (position
839 to 1345 in Fig. 6). The nitrocellulose filters were first incubated, at 42°C for
two hours, in 2 x SSC (0.3 M NaCl; 0.5 M Tris-HCl, pH 8.0) and then in a
prehybridization solution (50% formamide; 5 x SSPE, pH 7.4; 5 x Denhardt's
solution; 0.25% SDS; 100 µg of herring sperm DNA/ml). For the overnight
hybridization, the prehybridization solution was supplemented with 1.5 x 10^6 cpm of
denatured, radioactively labelled probe/ml of solution. Nonspecifically bound
radioactive DNA was removed under stringent conditions, i.e. by means of three
five-minute steps of washing with 2 x SSC; 0.1% SDS at from 55 to 65°C. The filters
were evaluated by autoradiography.

The phage clones which were identified in this primary investigation were purified
(Ausubel et al. (1987)). In subsequent analyses, one phage clone, P12, turned out
to be potentially positive. A λ DNA preparation carried out on this phage (Ausubel et
al. (1987)), and the subsequent restriction digestion with enzymes which release the
genomic insert in fragments, showed that this phage clone contains an insert of
approx. 15 kb in the vector (Fig. 2).

In order to isolate the complete hTC gene sequence, from 1 to 1.5 x 10^6 phages
were screened in each of several independent experiments, each with a different
radioactively labelled probe, as described above.

The phage clones which were identified in these primary investigations, and which
were positive for the corresponding probes, were purified. The phage clone P17 was
found to contain an hTC cDNA fragment of about 250 bp in length (position 1787 to
2040 in Fig. 6). The phage clone P2 was identified as containing an hTC cDNA
fragment of about 740 bp in length (position 1685 to 2349 plus 2531 to 2607 in Fig.
6 [deletion 2; cf. Example 5]). The phage clones P3 and P5 were found to contain a
3' hTC cDNA fragment of 420 bp in length (position 3047 to 3470 in Fig. 6). After
the λ DNA had been prepared from these phages, and subsequently subjected to
restriction digestion with enzymes which release the genomic insert in fragments, the
inserts were subcloned into plasmids (Example 4).
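
The approximate fragment lengths quoted for the probes can be checked against the cDNA coordinates. The sketch below assumes that the stated positions are inclusive on both ends, which the text does not say explicitly; the function name `span_length` and the dictionary labels are illustrative only.

```python
def span_length(segments):
    """Total length in bp of inclusive (start, end) coordinate pairs."""
    return sum(end - start + 1 for start, end in segments)


# Coordinates as quoted in Examples 1 and 2 (positions in Fig. 6).
probes = {
    "Example 1 probe, variant Del2": [(1685, 2349), (2531, 2590)],
    "5' probe used to find P12": [(839, 1345)],
    "probe for P17": [(1787, 2040)],
    "probe for P2": [(1685, 2349), (2531, 2607)],
    "3' probe for P3 and P5": [(3047, 3470)],
}

for name, segments in probes.items():
    print(f"{name}: {span_length(segments)} bp")
```

Under that assumption the computed lengths (725, 507, 254, 742 and 424 bp) agree with the rounded values of about 720, 500, 250, 740 and 420 bp given in the text.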

Example 3 

In order to investigate whether the 5' end of the hTC cDNA was also present in the
insert in the recombinant phage clone P12, the λ DNA from this clone was
hybridized, in a Southern blot analysis, with a radioactively labelled hTC cDNA
fragment of about 440 bp in length (position 1 to 440 in Fig. 6) from the extreme 5'
region (Fig. 3).

Since the isolated λ DNA from the positive clone also hybridizes with the extreme 5'
end of the hTC cDNA, this phage probably also contains the 5' sequence region
flanking the ATG start codon.

Example 4 

In order to subclone the entire 15 kb insert in the positive phage clone P12 in the
form of subfragments, and subsequently to sequence these fragments, restriction
endonucleases which, on the one hand, release the entire insert from EMBL3 Sp6/T7
(cf. Example 2) and, in addition, cut within the insert, were selected for digesting the
DNA.

In all, two Xho I subfragments, of about 8.3 and about 6.5 kb in length, respectively,
and three Sac I subfragments, of about 8.5, about 3.5 and about 3 kb in length,
respectively, were subcloned into the pBluescript KS(+) vector (from Stratagene).
The 5123 bp 5'-flanking nucleotide sequence of the ghTC gene region, starting from
the ATG start codon, was determined by analysing the sequences of these fragments
(Fig. 4; corresponding to SEQ ID NO 1). Fig. 4 depicts the first 5123 bp (starting
from the ATG start codon). Fig. 10 depicts the entire cloned 5' sequence
(corresponding to SEQ ID NO 3).

In order to subclone the entire insert, of approx. 14.6 kb in size, in phage clone P17
in the form of subfragments, restriction endonucleases which, on the one hand,
release the entire insert from EMBL3 Sp6/T7 and, in addition, cut a few times within
the insert, were selected for digesting the DNA. Three XhoI/BamHI fragments, of 7.1
kb, 4.2 kb and 1.5 kb in size, respectively, and one BamHI fragment, of 1.8 kb in
size, were subcloned by means of a combination digestion with the enzymes
XhoI and BamHI. Combination restriction digestion with the enzymes XhoI and XbaI
resulted in a XhoI/XbaI fragment of 6.5 kb in size, and two XhoI fragments, of 6.5 kb
and 1.5 kb in size, respectively, being cloned.

Digestion with the restriction enzyme XhoI was used to subclone the insert, of
approx. 17.9 kb in size, in phage clone P2 in the form of subfragments. In all, three
XhoI subfragments, of 7.5 kb, 6.4 kb and 1.6 kb in length, respectively, were cloned.
Four SacI fragments, of 4.8 kb, 3 kb, 2 kb and 1.8 kb in size, respectively, were
additionally subcloned by digesting with the restriction enzyme SacI.

The insert, of approx. 13.5 kb in size, in phage clone P3 was subcloned by digesting
with the restriction enzymes SacI and/or XhoI. Six SacI subfragments, of 3.2 kb,
2 kb, 0.9 kb, 0.8 kb, 0.65 kb and 0.5 kb in length, respectively, and two XhoI
subfragments, of 6.5 kb and 4.3 kb in length, respectively, were obtained in this
connection.

The insert, of approx. 13.2 kb in size, in phage clone P5 was subcloned by digesting
with the restriction enzymes SacI and/or XhoI. In all, SacI fragments of 6.5 kb, 3.3
kb, 3.2 kb, 0.8 kb and 0.3 kb in size, and XhoI fragments of 7 kb and 3.2 kb in size,
were subcloned.

In order to clone the hTC genomic sequence region located 3' of phage clone P17
and 5' of phage clone P2, 3 genomic walkings were carried out using the Clontech
GenomeWalker™ kits (catalogue number K1803-1) and various combinations of
primers. In a final volume of 50 µl, 10 pmol of dNTP mix were added to 1 µl of
human GenomeWalker Library HDL (from Clontech), and a PCR reaction was
carried out in 1x Klen Taq PCR reaction buffer and 1x Advantage Klen Taq
polymerase mix (from Clontech). 10 pmol of an internal gene-specific primer, and 10
pmol of the adaptor primer AP1 (5'-GTAATACGACTCACTATAGGGC-3'; from
Clontech) were added as primers. The PCR was carried out in 3 steps as a touchdown
PCR. First of all, denaturation was carried out at 94°C for 20 sec, and the primers
were then annealed, and the DNA chain extended, at 72°C for 4 min, over 7 cycles.
There then followed 37 cycles in which the DNA was denatured at 94°C for 20 sec
but the subsequent primer extension took place at 67°C for 4 min. In conclusion,
there followed a chain extension at 67°C for 4 min. After this first PCR, the PCR
product was diluted 1:50. One µl of this dilution was used in a second nested PCR
together with 10 pmol of dNTP mix in 1x Klen Taq PCR reaction buffer and
1x Advantage Klen Taq polymerase mix and also 10 pmol of a nested gene-specific
primer and 10 pmol of the nested Marathon Adaptor primer AP2
(5'-ACTATAGGGCACGCGTGGT-3'; from Clontech). The PCR conditions
corresponded to the parameters which were selected in the first PCR. As the sole
exception, only 5 cycles rather than 7 cycles were selected in the first PCR step and
only 24 cycles, instead of 37 cycles, were run in the second PCR step. The products
of this nested genomic walking PCR were cloned into the TA Cloning Vector pCRII
from Invitrogen.
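
The two cycling programs described above can be written down as data, which makes the difference between the first and the nested PCR easy to see. This is a sketch under stated assumptions: the names `Stage`, `touchdown_program` and `hold_seconds` are hypothetical, and the time estimate counts only the programmed hold times, ignoring temperature ramping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    cycles: int
    steps: tuple  # (temperature in °C, hold time in seconds) pairs


def touchdown_program(first_cycles, second_cycles):
    """Three-step program: 72°C stage, 67°C stage, final extension."""
    denature = (94, 20)
    return [
        Stage(first_cycles, (denature, (72, 240))),   # anneal/extend at 72°C
        Stage(second_cycles, (denature, (67, 240))),  # anneal/extend at 67°C
        Stage(1, ((67, 240),)),                       # concluding extension
    ]


def hold_seconds(program):
    """Sum of programmed hold times, without ramping."""
    return sum(
        stage.cycles * sum(seconds for _, seconds in stage.steps)
        for stage in program
    )


first_round = touchdown_program(7, 37)    # first PCR
nested_round = touchdown_program(5, 24)   # nested PCR

for label, program in (("first PCR", first_round), ("nested PCR", nested_round)):
    print(f"{label}: {hold_seconds(program) / 60:.1f} min of hold time")
```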




In the first genomic walking, the gene-specific primer C3K2-GSP1
(5'-GACGTGGCTCTTGAAGGCCTTG-3') and the nested gene-specific primer C3K2-GSP2
(5'-GCCTTCTGGACCACGGCATACC-3') were used, together with the
HDL library 4, and a PCR fragment of 1639 bp in length was obtained. In the second
genomic walking, a PCR fragment of 685 bp in length was amplified from the HDL
library 4 using the gene-specific primer C3F2
(5'-CGTAGTTGAGCACGCTGAACAGTG-3') and the nested gene-specific primer
C3F (5'-CCTTCACCCTCGAGGTGAGACGCT-3'). The third genomic walking
mixture, using the gene-specific primer DEL5-GSP1
(5'-GGTGGATGTGACGGGCGCGTACG-3') and the nested gene-specific primer
C5K-GSP1 (5'-GGTATGCCGTGGTCCAGAAGGC-3'), led to a 924 bp PCR
fragment being cloned from the HDL library 1. In all, 2100 bp of the genomic hTC
region located 3' of phage clone P17 were identified using this genomic walking
method (see Fig. 7).
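
Comparing the primer sequences listed above shows that C5K-GSP1 is the reverse complement of C3K2-GSP2, i.e. the two primers anneal to the same site on opposite strands and therefore prime walking in opposite directions. A minimal sketch for checking this follows; the helper name `reverse_complement` is illustrative, and only unambiguous bases (A, C, G, T) are handled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


c3k2_gsp2 = "GCCTTCTGGACCACGGCATACC"   # nested primer, first walking
c5k_gsp1 = "GGTATGCCGTGGTCCAGAAGGC"    # nested primer, third walking

print(reverse_complement(c3k2_gsp2) == c5k_gsp1)
```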

The subcloned fragments, and the genomic walking products, were sequenced in
single-stranded form. The Lasergene Biocomputing Software (DNASTAR Inc.,
Madison, Wisconsin, USA) was used to identify overlapping regions and form
contigs. In all, 2 large contigs were assembled from the sequences collected from
phage clones P12, P17, P2, P3 and P5, and also the sequence data from the genomic
walking. Contig 1 consists of sequence data from phage clones P12 and P17 and the
sequence data from the genomic walking. Contig 2 was put together from the
sequences from phage clones P2, P3 and P5. Overlapping phage clone regions are
shown diagrammatically in Fig. 7. The sequence data from the 2 contigs are shown
below. The ATG start codon in contig 1 is underlined. The TGA stop codon is
underlined in contig 2.
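
The principle of joining overlapping reads into a contig can be illustrated with a naive exact-match sketch. It is not the algorithm of the Lasergene software, whose method is not described here; the function names, the `min_len` threshold and the two short reads are assumptions for illustration, and real assemblers also tolerate sequencing errors and consider both strands.

```python
def overlap_length(left, right, min_len=3):
    """Length of the longest suffix of left that equals a prefix of right."""
    longest = min(len(left), len(right))
    for size in range(longest, min_len - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def merge(left, right, min_len=3):
    """Join two reads at their overlap, or return None if there is none."""
    size = overlap_length(left, right, min_len)
    if size == 0:
        return None
    return left + right[size:]


# Two hypothetical reads that overlap by five bases.
read_a = "ACTTGAGCCCAAG"
read_b = "CCAAGAGTTC"
print(merge(read_a, read_b))
```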

Contig 1:



ACTTGAGCCC AAGAGTTCAA GGCTACGGTG AGCCATGATT GCAACACCAC ACGCCAGCCT TGGTGACAGA 70 

ATGAGACCCT GTCTCAAAAA AAAAAAAAAA AATTGAAATA ATATAAAGCA TCTTCTCTGG CCACAGTGGA 140 

ACAAAACCAG AAATCAACAA CAAGAGGAAT TTTGAAAACT ATACAAACAC ATGAAAATTA AACAATATAC 210 

TTCTGAATGA CCAGTGAGTC AATGAAGAAA TTAAAAAGGA AATTGAAAAA TTTATTTAAG CAAATGATAA 280 

CGGAAACATA ACCTCTCAAA ACCCACGGTA TACAGCAAAA GCAGTGCTAA GAAGGAAGTT TATAGCTATA 3 50 

AGCAGCTACA TCAAAAAAGT AGAAAAGCCA GGCGCAGTGG CTCATGCCTG TAATCCCAGC ACTTTGGGAG 420 

GCCAAGGCGG GCAGATCGCC TGAGGTCAGG AGTTCGAGAC CAGCCTGACC AACACAGAGA AACCTTGTCG 4 90 

CTACTAAAAA TACAAAATTA GCTGGGCATG GTGGCACATG CCTGTAATCC CAGCTACTCG GGAGGCTGAG 560 

GCAGGATAAC CGCTTGAACC CAGGAGGTGG AGGTTGCGGT GAGCCGGGAT TGCGCCATTG GACTCCAGCC 630 

TGGGTAACAA GAGTGAAACC CTGTCTCAAG AAAAAAAAAA AAGTAGAAAA ACTTAAAAAT ACAACCTAAT 700 

GATGCACCTT AAAGAACTAG AAAAGCAAGA GCAAACTAAA CCTAAAATTG GTAAAAGAAA AGAAATAATA 770

AAGATCAGAG CAGAAATAAA TGAAACTGAA AGATAACAAT ACAAAAGATC AACAAAATTA AAAGTTGGTT 840 

TTTTGAAAAG ATAAACAAAA TTGACAAACC TTTGCCCAGA CTAAGAAAAA AGGAAAGAAG ACCTAAATAA 910 

ATAAAGTCAG AGATGAAAAA AGAGACATTA CAACTGATAC CACAGAAATT CAAAGGATCA CTAGAGGCTA 980

CTATGAGCAA CTGTACACTA ATAAATTGAA AAACCTAGAA AAAATAGATA AATTCCTAGA TGCATACAAC 1050 

CTACCAAGAT TGAACCATGA AGAAATCCAA AGCCCAAACA GACCAATAAC AATAATGGGA TTAAAGCCAT 1120 

AATAAAAAGT CTCCTAGCAA AGAGAAGCCC AGGACCCAAT GGCTTCCCTG CTGGATTTTA CCAATCATTT 1190 

AAAGAAGAAT GAATTCCAAT CCTACTCAAA CTATTCTGAA AAATAGAGGA AAGAATACTT CCAAACTCAT 1260 

TCTACATGGC CAGTATTACC CTGATTCCAA AACCAGACAA AAACACATCA AAAACAAACA AACAAAAAAA 1330 

CAGAAAGAAA GAAAACTACA GGCCAATATC CCTGATGAAT ACTGATACAA AAATCCTCAA CAAAACACTA 14 00 

GCAAACCAAA TTAAACAACA CCTTCGAAAG ATCATTCATT GTGATCAAGT GGGATTTATT CCAGGGATGG 14 70 

AAGGATGGTT CAACATATGC AAATCAATCA ATGTGATACA TCATCCCAAC AAAATGAAGT ACAAAAACTA 154 0 

TATGATTATT TCACTTTATG CAGAAAAAGC ATTTGATAAA ATTCTGCACC CTTCATGATA AAAACCCTCA 1610 

AAAAACCAGG TATACAAGAA ACATACAGGC CAGGCACAGT GGCTCACACC TGCGATCCCA GCACTCTGGG 168 0 

AGGCCAAGGT GGGATGATTG CTTGGGCCCA GGAGTTTGAG ACTAGCCTGG GCAACAAAAT GAGACCTGGT 17 50 

CTACAAAAAA CTTTTTTAAA AAATTAGCCA GGCATGATGG CATATGCCTG TAGTCCCAGC TAGTCTGGAG 1820 

GCTGAGGTGG GAGAATCACT TAAGCCTAGG AGGTCGAGGC TGCAGTGAGC CATGAACATG TCACTGTACT 18 90 

CCAGCCTAGA CAACAGAACA AGACCCCACT GAATAAGAAG AAGGAGAAGG AGAAGGGAGA AGGGAGGGAG 1960 

AAGGGAGGAG GAGGAGAAGG AGGAGGTGGA GGAGAAGTGG AAGGGGAAGG GGAAGGGAAA GAGGAAGAAG 2030 

AAGAAACATA TTTCAACATA ATAAAAGCCC TATATGACAG ACCGAGGTAG TATTATGAGG AAAAACTGAA 2100 

AGCCTTTCCT CTAAGATCTG GAAAATGACA AGGGCCCACT TTCACCACTG TGATTCAACA TAGTACTAGA 217 0 

AGTCCTAGCT AGAGCAATCA GATAAGAGAA AGAAATAAAA GGCATCCAAA CTGGAAAGGA AGAAGTCAAA 22 4 0 

TTATCCTGTT TGCAGATGAT ATGATCTTAT ATCTGGAAAA GACTTAAGAC ACCACTAAAA AACTATTAGA 2 310 

GCTGAAATTT GGTACAGCAG GATACAAAAT CAATGTACAA AAATCAGTAG TATTTCTATA TTCCAACAGC 2 380 

AAACAATCTG AAAAAGAAAC CAAAAAAGCA GCTACAAATA AAATTAAACA GCTAGGAATT AACCAAAGAA 2 4 50 

GTGAAAGATC TCTACAATGA AAACTATAAA ATGTTGATAA AAGAAATTGA AGAGGGCACA AAAAAAGAAA 2 520 

AGATATTCCA TGTTCATAGA TTGGAAGAAT AAATACTGTT AAAATGTCCA TACTACCCAA AGCAATTTAC 2590 

AAATTCAATG CAATCCCTAT TAAAATACTA ATGACGTTCT TCACAGAAAT AGAAGAAACA ATTCTAAGAT 2660 

TTGTACAGAA CCACAAAAGA CCCAGAATAG CCAAAGCTAT CCTGACCAAA AAGAACAAAA CTGGAAGCAT 27 30 

CACATTACCT GACTTCAAAT TATACTACAA AGCTATAGTA ACCCAAACTA CATGGTACTG GCATAAAAAC 2800 

AGATGAGACA TGGACCAGAG GAACAGAATA GAGAATCCAG AAACAAATCC ATGCATCTAC AGTGAACTCA 2870 

TTTTTGACAA AGGTGCCAAG AACATACTTT GGGGAAAAGA TAATCTCTTC AATAAATGGT GCTGGAGGAA 2 94 0 

CTGGATATCC ATATGCAAAA TAACAATACT AGAACTCTGT CTCTCACCAT ATACAAAAGC AAATCAAAAT 3010 

GGATGAAAGG CTTAAATCTA AAACCTCAAA CTTTGCAACT ACTAAAAGAA AACACCGGAG AAACTCTCCA 3080 

GGACATTGGA GTGGGCAAAG ACTTCTTGAG TAATTCCCTG CAGGCACAGG CAACCAAAGC AAAAACAGAC 3150 

AAATGGGATC ATATCAAGTT AAAAAGCTTC TGCCCAGCAA AGGAAACAAT CAACAAAGAG AAGAGACAAC 32 20 

CCACAGAATG GGAGAATATA TTTGCAAACT ATTCATCTAA CAAGGAATTA ATAACCAGTA TATATAAGGA 3290 

GCTCAAACTA CTCTATAAGA AAAACACCTA ATAAGCTGAT TTTCAAAAAT AAGCAAAAGA TCTGGGTAGA 3360 

CATTTCTCAA AATAAGTCAT ACAAATGGCA AACAGGCATC TGAAAATGTG CTCAACACCA CTGATCATCA 34 30 

GAGAAATGCA AATCAAAACT ACTATGAGAG ATCATCTCAT CCCAGTTAAA ATGGCTTTTA TTCAAAAGAC 3500

AGGCAATAAC AAATGCCAGT GAGGATGTGG ATAAAAGGAA ACCCTTGGAC ACTGTTGGTG GGAATGGAAA 3570 

TTGCTACCAC TATGGAGAAC AGTTTGAAAG TTCCTCAAAA AACTAAAAAT AAAGCTACCA TACAGCAATC 3 64 0 

CCATTGCTAG GTATATACTC CAAAAAAGGG AATCAGTGTA TCAACAAGCT ATCTCCACTC CCACATTTAC 3710 

TGCAGCACTG TTCATAGCAG CCAAGGTTTG GAAGCAACCT CAGTGTCCAT CAACAGACGA ATGGAAAAAG 37 8 0 

AAAATGTGGT GCACATACAC AATGGAGTAC TACGCAGCCA TAAAAAAGAA TGAGATCCTG TCAGTTGCAA 3850

CAGCATGGGG GGCACTGGTC AGTATGTTAA GTGAAATAAG CCAGGCACAG AAAGACAAAC TTTTCATGTT 392 0 

CTCCCTTACT TGTGGGAGCA AAAATTAAAA CAATTGACAT AGAAATAGAG GAGAATGGTG GTTCTAGAGG 39 90 

GGTGGGGGAC AGGGTGACTA GAGTCAACAA TAATTTATTG TATGTTTTAA AATAACTAAA AGAGTATAAT 4 0 60 

TGGGTTGTTT GTAACACAAA GAAAGGATAA ATGCTTGAAG GTGACAGATA CCCCATTTAC CCTGATGTGA 4130

TTATTACACA TTGTATGCCT GTATCAAAAT ATCTCATGTA TGCTATAGAT ATAAACCCTA CTATATTAAA 4200

AATTAAAATT TTAATGGCCA GGCACGGTGG CTCATGTCCG TAATCCCAGC ACTTTGGGAG GCCGAGGCGG 4 270 

GTGGATCACC TGAGGTCAGG AGTTTGAAAC CAGTCTGGCC ACCATGATGA AACCCTGTCT CTACTAAAGA 4 340 

TACAAAAATT AGCCAGGCGT GGTGGCACAT ACCTGTAGTC CCAACTACTC AGGAGGCTGA GACAGGAGAA 4 4 10 

TTGCTTGAAC CTGGGAGGCG GAGGTTGCAG TGAGCCGAGA TCATGCCACT GCACTGCAGC CTGGGTGACA 4 4 80 

GAGCAAGACT CCATCTCAAA ACAAAAACAA AAAAAAGAAG ATTAAAATTG TAATTTTTAT GTACCGTATA 4550

AATATATACT CTACTATATT AGAAGTTAAA AATTAAAACA ATTATAAAAG GTAATTAACC ACTTAATCTA 4620

AAATAAGAAC AATGTATGTG GGGTTTCTAG CTTCTGAAGA AGTAAAAGTT ATGGCCACGA TGGCAGAAAT 4 690 

GTGAGGAGGG AACAGTGGAA GTTACTGTTG TTAGACGCTC ATACTCTCTG TAAGTGACTT AATTTTAACC 4 7 60 

AAAGACAGGC TGGGAGAAGT TAAAGAGGCA TTCTATAAGC CCTAAAACAA CTGCTAATAA TGGTGAAAGG 4830

TAATCTCTAT TAATTACCAA TAATTACAGA TATCTCTAAA ATCGAGCTGC AGAATTGGCA CGTCTGATCA 4 900 

CACCGTCCTC TCATTCACGG TGCTTTTTTT CTTGTGTGCT TGGAGATTTT CGATTGTGTG TTCGTGTTTG 4970

GTTAAACTTA ATCTGTATGA ATCCTGAAAC GAAAAATGGT GGTGATTTCC TCCAGAAGAA TTAGAGTACC 5040 

TGGCAGGAAG CAGGTGGCTC TGTGGACCTG AGCCACTTCA ATCTTCAAGG GTCTCTGGCC AAGACCCAGG 5110 
TGCAAGGCAG AGGCCTGATG ACCCGAGGAC AGGAAAGCTC GGATGGGAAG GGGCGATGAG AAGCCTGCCT 5180
CGTTGGTGAG CAGCGCATGA AGTGCCCTTA TTTACGCTTT GCAAAGATTG CTCTGGATAC CATCTGGAAA 52 50 
AGGCGGCCAG CGGGAATGCA AGGAGTCAGA AGCCTCCTGC TCAAACCCAG GCCAGCAGCT ATGGCGCCCA 5320 
CCCGGGCGTG TGCCAGAGGG AGAGGAGTCA AGGCACCTCG AAGTATGGCT TAAATCTTTT TTTCACCTGA 5390 
AGCAGTGACC AAGGTGTATT CTGAGGGAAG CTTGAGTTAG GTGCCTTCTT TAAAACAGAA AGTCATGGAA 54 60 
GCACCCTTCT CAAGGGAAAA CCAGACGCCC GCTCTGCGGT CATTTACCTC TTTCCTCTCT CCCTCTCTTG 5530 
CCCTCGCGGT TTCTGATCGG GACAGAGTGA CCCCCGTGGA GCTTCTCCGA GCCCGTGCTG AGGACCCTCT 5600 
TGCAAAGGGC TCCACAGACC CCCGCCCTGG AGAGAGGAGT CTGAGCCTGG CTTAATAACA AACTGGGATG 567 0 
TGGCTGGGGG CGGACAGCGA CGGCGGGATT CAAAGACTTA ATTCCATGAG TAAATTCAAC CTTTCCACAT 57 4 0 
CCGAATGGAT TTGGATTTTA TCTTAATATT TTCTTAAATT TCATCAAATA ACATTCAGGA CTGCAGAAAT 5810 
CCAAAGGCGT AAAACAGGAA CTGAGCTATG TTTGCCAAGG TCCAAGGACT TAATAACCAT GTTCAGAGGG 5880 
ATTTTTCGCC CTAAGTACTT TTTATTGGTT TTCATAAGGT GGCTTAGGGT GCAAGGGAAA GTACACGAGG 5 950 
AGAGGCCTGG GCGGCAGGGC TATGAGCACG GCAGGGCCAC CGGGGAGAGA GTCCCCGGCC TGGGAGGCTG 6020 
ACAGCAGGAC CACTGACCGT CCTCCCTGGG AGCTGCCACA TTGGGCAACG CGAAGGCGGC CACGCTGCGT 6090 
GTGACTCAGG ACCCCATACC GGCTTCCTGG GCCCACCCAC ACTAACCCAG GAAGTCACGG AGCTCTGAAC 6160 
CCGTGGAAAC GAACATGACC CTTGCCTGCC TGCTTCCCTG GGTGGGTCAA GGGTAATGAA GTGGTGTGCA 62 30 
GGAAATGGCC ATGTAAATTA CACGACTCTG CTGATGGGGA CCGTTCCTTC CATCATTATT CATCTTCACC 6300 
CCCAAGGACT GAATGATTCC AGCAACTTCT TCGGGTGTGA CAAGCCATGA CAAAACTCAG TACAAACACC 637 0 
ACTCTTTTAC TAGGCCCACA GAGCACGGSC CACACCCCTG ATATATTAAG AGTCCAGGAG AGATGAGGCT 64 4 0 
GCTTTCAGCC ACCAGGCTGG GGTGACAACA GCGGCTGAAC AGTCTGTTCC TCTAGACTAG TAGACCCTGG 6510 
CAGGCACTCC CCCAGATTCT AGGGCCTGGT TGCTGCTTCC CGAGGGCGCC ATCTGCCCTG GAGACTCAGC 658 0 
CTGGGGTGCC ACACTGAGGC CAGCCCTGTC TCCACACCCT CCGCCTCCAG GCCTCAGCTT CTCCAGCAGC 6650 
TTCCTAAACC CTGGGTGGGC CGTGTTCCAG CGCTACTGTC TCACCTGTCC CACTGTGTCT TGTCTCAGCG 67 2 0 
ACGTAGCTCG CACGGTTCCT CCTCACATGG GGTGTCTGTC TCCTTCCCCA ACACTCACAT GCGTTGAAGG 6790
GAGGAGATTC TGCGCCTCCC AGACTGGCTC CTCTGAGCCT GAACCTGGCT CGTGGCCCCC GATGCAGGTT 68 60 
CCTGGCGTCC GGCTGCACGC TGACCTCCAT TTCCAGGCGC TCCCCGTCTC CTGTCATCTG CCGGGGCCTG 6930 
CCGGTGTGTT CTTCTGTTTC TGTGCTCCTT TCCACGTCCA GCTGCGTGTG TCTCTGCCCG CTAGGGTCTC 7 000 
GGGGTTTTTA TAGGCATAGG ACGGGGGCGT GGTGGGCCAG GGCGCTCTTG GGAAATGCAA CATTTGGGTG 7 070 
TGAAAGTAGG AGTGCCTGTC CTCACCTAGG TCCACGGGCA CAGGCCTGGG GATGGAGCCC CCGCCAGGGA 7140 
CCCGCCCTTC TCTGCCCAGC ACTTTCCTGC CCCCCTCCCT CTGGAACACA GAGTGGCAGT TTCCACAAGC 7 210 
ACTAAGCATC CTCTTCCCAA AAGACCCAGC ATTGGCACCC CTGGACATTT GCCCCACAGC CCTGGGAATT 7 280 
CACGTGACTA CGCACATCAT GTACACACTC CCGTCCACGA CCGACCCCCG CTGTTTTATT TTAATAGCTA 7 350 
CAAAGCAGGG AAATCCCTGC TAAAATGTCC TTTAACAAAC TGGTTAAACA AACGGGTCCA TCCGCACGGT 7 420 
GGACAGTTCC TCACAGTGAA GAGGAACATG CCGTTTATAA AGCCTGCAGG CATCTCAAGG GAATTACGCT 7 4 90 
GAGTCAAAAC TGCCACCTCC ATGGGATACG TACGCAACAT GCTCAAAAAG AAAGAATTTC ACCCCATGGC 7 560 
AGGGGAGTGG TTAGGGGGGT TAAGGACGGT GGGGGCGGCA GCTGGGGGCT ACTGCACGCA CCTTTTACTA 7 630 
AAGCCAGTTT CCTGGTTCTG ATGGTATTGG CTCAGTTATG GGAGACTAAC CATAGGGGAG TGGGGATGGG 7 7 00 
GGAACCCGGA GGCTGTGCCA TCTTTGCCAT GCCCGAGTGT CCTGGGCAGG ATAATGCTCT AGAGATGCCC 7 770 
ACGTCCTGAT TCCCCCAAAC CTGTGGACAG AACCCGCCCG GCCCCAGGGC CTTTGCAGGT GTGATCTCCG 78 4 0 
TGAGGACCCT GAGGTCTGGG ATCCTTCGGG ACTACCTGCA GGCCCGAAAA GTAATCCAGG GGTTCTGGGA 7 910 
AGAGGCGGGC AGGAGGGTCA GAGGGGGGCA GCCTCAGGAC GATGGAGGCA GTCAGTCTGA GGCTGAAAAG 7 980 
GGAGGGAGGG CCTCGAGCCC AGGCCTGCAA GCGCCTCCAG AAGCTGGAAA AAGCGGGGAA GGGACCCTCC 8050 
ACGGAGCCTG CAGCAGGAAG GCACGGCTGG CCCTTAGCCC ACCAGGGCCC ATCGTGGACC TCCGGCCTCC 8120 
GTGCCATAGG AGGGCACTCG CGCTGCCCTT CTAGCATGAA GTGTGTGGGG ATTTGCAGAA GCAACAGGAA 8190 
ACCCATGCAC TGTGAATCTA GGATTATTTC AAAACAAAGG TTTACAGAAA CATCCAAGGA CAGGGCTGAA 82 60 
GTGCCTCCGG GCAAGGGCAG GGCAGGCACG AGTGATTTTA TTTAGCTATT TTATTTTATT TACTTACTTT 8 3 30 
CTGAGACAGA GTTATGCTCT TGTTGCCCAG GCTGGAGTGC AGCGGCATGA TCTTGGCTCA CTGCAACCTC 8400 
CGTCTCCTGG GTTCAAGCAA TTCTCGTGCC TCAGCCTCCC AAGTAGCTGG GATTTCAGGC GTGCACCACC 8470 
ACACCCGGCT AATTTTGTAT TTTTAGTAGA GATGGGCTTT CACCATGTTG GTCAAGCTGA TCTCAAAATC 854 0 
CTGACCTCAG GTGATCCGCC CACCTCAGCC TCCCAAAGTG CTGGGATTAC AGGCATGAGC CACTGCACCT 8 610 
GGCCTATTTA ACCATTTTAA AACTTCCCTG GGCTCAAGTC ACACCCACTG GTAAGGAGTT CATGGAGTTC 8 680 
AATTTCCCCT TTACTCAGGA GTTACCCTCC TTTGATATTT TCTGTAATTC TTCGTAGACT GGGGATACAC 87 50 
CGTCTCTTGA CATATTCACA GTTTCTGTGA CCACCTGTTA TCCCATGGGA CCCACTGCAG GGGCAGCTGG 8820 
GAGGCTGCAG GCTTCAGGTC CCAGTGGGGT TGCCATCTGC CAGTAGAAAC CTGATGTAGA ATCAGGGCGC 8890 
AAGTGTGGAC ACTGTCCTGA ATCTCAATGT CTCAGTGTGT GCTGAAACAT GTAGAAATTA AAGTCCATCC 8960 
CTCCTACTCT ACTGGGATTG AGCCCCTTCC CTATCCCCCC CCAGGGGCAG AGGAGTTCCT CTCACTCCTG 90 30 
TGGAGGAAGG AATGATACTT TGTTATTTTT CACTGCTGGT ACTGAATCCA CTGTTTCATT TGTTGGTTTG 9100 
TTTGTTTTGT TTTGAGAGGC GGTTTCACTC TTGTTGCTCA GGCTGGAGGG AGTGCAATGG CGCGATCTTG 917 0 
GCTTACTGCA GCCTCTGCCT CCCAGGTTCA AGTGATTCTC CTGCTTCCGC CTCCCATTTG GCTGGGATTA 924 0 
CAGGCACCCG CCACCATGCC CAGCTAATTT TTTGTATTTT TAGTAGAGAC GGGGGTGGGT GGGGTTCACC 9 310 
ATGTTGGCCA GGCTGGTCTC GAACTTCTGA CCTCAGATGA TCCACCTGCC TCTGCCTCCT AAAGTGCTGG 9380 
GATTACAGGT GTGAGCCACC ATGCCCAGCT CAGAATTTAC TCTGTTTAGA AACATCTGGG TCTGAGGTAG 94 50 
GAAGCTCACC CCACTCAAGT GTTGTGGTGT TTTAAGCCAA TGATAGAATT TTTTTATTGT TGTTAGAACA 9 520 
CTCTTGATGT TTTACACTGT GATGACTAAG ACATCATCAG CTTTTCAAAG ACACACTAAC TGCACCCATA 9 5 90 
ATACTGGGGT GTCTTCTGGG TATCAGCAAT CTTCATTGAA TGCCGGGAGG CGTTTCCTCG CCATGCACAT 9 660 
GGTGTTAATT ACTCCAGCAT AATCTTCTGC TTCCATTTCT TCTCTTCCCT CTTTTAAAAT TGTGTTTTCT 9730 
ATGTTGGCTT CTCTGCAGAG AACCAGTGTA AGCTACAACT TAACTTTTGT TGGAACAAAT TTTCCAAACC 9800 
GCCCCTTTGC CCTAGTGGCA GAGACAATTC ACAAACACAG CCCTTTAAAA AGGCTTAGGG ATCACTAAGG 987 0 
GGATTTCTAG AAGAGCGACC TGTAATCCTA AGTATTTACA AGACGAGGCT AACCTCCAGC GAGCGTGACA 9 94 0 
GCCCAGGGAG GGTGCGAGGC CTGTTCAAAT GCTAGCTCCA TAAATAAAGC AATTTCCTCC GGCAGTTTCT 10010 
GAAAGTAGGA AAGGTTACAT TTAAGGTTGC GTTTGTTAGC ATTTCAGTGT TTGCCGACCT CAGCTACAGC 1008 0 
ATCCCTGCAA GGCCTCGGGA GACCCAGAAG TTTCTCGCCC CCTTAGATCC AAACTTGAGC AACCCGGAGT 10150 
CTGGATTCCT GGGAAGTCCT CAGCTGTCCT GCGGTTGTGC CGGGGCCCCA GGTCTGGAGG GGACCAGTGG 10220 
CCGTGTGGCT TCTACTGCTG GGCTGGAAGT CGGGCCTCCT AGCTCTGCAG TCCGAGGCTT GGAGCCAGGT 10290 
GCCTGGACCC CGAGGCTGCC CTCCACCCTG TGCGGGCGGG ATGTGACCAG ATGTTGGCCT CATCTGCCAG 10 360 
ACAGAGTGCC GGGGCCCAGG GTCAAGGCCG TTGTGGCTGG TGTGAGGCGC CCGGTGCGCG GCCAGCAGGA 104 30 
GCGCCTGGCT CCATTTCCCA CCCTTTCTCG ACGGGACCGC CCCGGTGGGT GATTAACAGA TTTGGGGTGG 10500 
TTTGCTCATG GTGGGGACCC CTCGCCGCCT GAGAACCTGC AAAGAGAAAT GACGGGCCTG TGTCAAGGAG 10 570 
CCCAAGTCGC GGGGAAGTGT TGCAGGGAGG CACTCCGGGA GGTCCCGCGT GCCCGTCCAG GGAGCAATGC 10640
GTCCTCGGGT TCGTCCCCAG CCGCGTCTAC GCGCCTCCGT CCTCCCCTTC ACGTCCGGCA TTCGTGGTGC 10710
CCGGAGCCCG ACGCCCCGCG TCCGGACCTG GAGGCAGCCC TGGGTCTCCG GATCAGGCCA GCGGCCAAAG 10780
GGTCGCCGCA CGCACCTGTT CCCAGGGCCT CCACATCATG GCCCCTCCCT CGGGTTACCC CACAGCCTAG 10850
GCCGATTCGA CCTCTCTCCG CTGGGGCCCT CGCTGGCGTC CCTGCACCCT GGGAGCGCGA GCGGCGCGCG 10920
GGCGGGGAAG CGCGGCCCAG ACCCCCGGGT CCGCCCGGAG CAGCTGCGCT GTCGGGGCCA GGCCGGGCTC 10990
CCAGTGGATT CGCGGGCACA GACGCCCAGG ACCGCGCTCC CCACGTGGCG GAGGGACTGG GGACCCGGGC 11060
ACCCGTCCTG CCCCTTCACC TTCCAGCTCC GCCTCCTCCG CGCGGACCCC GCCCCGTCCC GACCCCTCCC 11130
GGGTCCCCGG CCCAGCCCCC TCCGGGCCCT CCCAGCCCCT CCCCTTCCTT TCCGCGGCCC CGCCCTCTCC 11200
TCGCGGCGCG AGTTTCAGGC AGCGCTGCGT CCTGCTGCGC ACGTGGGAAG CCCTGGCCCC GGCCACCCCC 11270
GCG ATG CCGC GCGCTCCCCG CTGCCGAGCC GTGCGCTCCC TGCTGCGCAG CCACTACCGC GAGGTGCTGC 11340
CGCTGGCCAC GTTCGTGCGG CGCCTGGGGC CCCAGGGCTG GCGGCTGGTG CAGCGCGGGG ACCCGGCGGC 11410
TTTCCGCGCG CTGGTGGCCC AGTGCCTGGT GTGCGTGCCC TGGGACGCAC GGCCGCCCCC CGCCGCCCCC 11480
TCCTTCCGCC AGGTGGGCCT CCCCGGGGTC GGCGTCCGGC TGGGGTTGAG GGCGGCCGGG GGGAACCAGC 11550
GACATGCGGA GAGCAGCGCA GGCGACTCAG GGCGCTTCCC CCGCAGGTGT CCTGCCTGAA GGAGCTGGTG 11620
GCCCGAGTGC TGCAGAGGCT GTGCGAGCGC GGCGCGAAGA ACGTGCTGGC CTTCGGCTTC GCGCTGCTGG 11690
ACGGGGCCCG CGGGGGCCCC CCCGAGGCCT TCACCACCAG CGTGCGCAGC TACCTGCCCA ACACGGTGAC 11760
CGACGCACTG CGGGGGAGCG GGGCGTGGGG GCTGCTGCTG CGCCGCGTGG GCGACGACGT GCTGGTTCAC 11830
CTGCTGGCAC GCTGCGCGCT CTTTGTGCTG GTGGCTCCCA GCTGCGCCTA CCAGGTGTGC GGGCCGCCGC 11900
TGTACCAGCT CGGCGCTGCC ACTCAGGCCC GGCCCCCGCC ACACGCTAGT GGACCCCGAA GGCGTCTGGG 11970
ATGCGAACGG GCCTGGAACC ATAGCGTCAG GGAGGCCGGG GTCCCCCTGG GCCTGCCAGC CCCGGGTGCG 12040
AGGAGGCGCG GGGGCAGTGC CAGCCGAAGT CTGCCGTTGC CCAAGAGGCC CAGGCGTGGC GCTGCCCCTG 12110
AGCCGGAGCG GACGCCCGTT GGGCAGGGGT CCTGGGCCCA CCCGGGCAGG ACGCGTGGAC CGAGTGACCG 12180
TGGTTTCTGT GTGGTGTCAC CTGCCAGACC CGCCGAAGAA GCCACCTCTT TGGAGGGTGC GCTCTCTGGC 12250
ACGCGCCACT CCCACCCATC CGTGGGCCGC CAGCACCACG CAGGCCCCCC ATCCACATCG CGGCCACCAC 12320
GTCCCTGGGA CACGCCTTGT CCCCCGGTGT ACGCCGAGAC CAAGCACTTC CTCTACTCCT CAGGCGACAA 12390
GGAGCAGCTG CGGCCCTCCT TCCTACTCAG CTCTCTGAGG CCCAGCCTGA CTGGCGCTCG GAGGCTCGTG 12460
GAGACCATCT TTCTGGGTTC CAGGCCCTGG ATGCCAGGGA CTCCCCGCAG GTTGCCCCGC CTGCCCCAGC 12530
GCTACTGGCA AATGCGGCCC CTGTTTCTGG AGCTGCTTGG GAACCACGCG CAGTGCCCCT ACGGGGTGCT 12600
CCTCAAGACG CACTGCCCGC TGCGAGCTGC GGTCACCCCA GCAGCCGGTG TCTGTGCCCG GGAGAAGCCC 12670
CAGGGCTCTG TGGCGGCCCC CGAGGAGGAG GACACAGACC CCCGTCGCCT GGTGCAGCTG CTCCGCCAGC 12740
ACAGCAGCCC CTGGCAGGTG TACGGCTTCG TGCGGGCCTG CCTGCGCCGG CTGGTGCCCC CAGGCCTCTG 12810
GGGCTCCAGG CACAACGAAC GCCGCTTCCT CAGGAACACC AAGAAGTTCA TCTCCCTGGG GAAGCATGCC 12880
AAGCTCTCGC TGCAGGAGCT GACGTGGAAG ATGAGCGTGC GGGACTGCGC TTGGCTGCGC AGGAGCCCAG 12950
GTGAGGAGGT GGTGGCCGTC GAGGGCCCAG GCCCCAGAGC TGAATGCAGT AGGGGCTCAG AAAAGGGGGC 13020
AGGCAGAGCC CTGGTCCTCC TGTCTCCATC GTCACGTGGG CACACGTGGC TTTTCGCTCA GGACGTCGAG 13090
TGGACACGGT GATCTCTGCC TCTGCTCTCC CTCCTGTCCA GTTTGCATAA ACTTACGAGG TTCACCTTCA 13160
CGTTTTGATG GACACGCGGT TTCCAGGCGC CGAGGCCAGA GCAGTGAACA GAGGAGGCTG GGCGCGGCAG 13230
TGGAGCCGGG TTGCCGGCAA TGGGGAGAAG TGTCTGGAAG CACAGACGCT CTGGCGAGGG TGCCTGCAGG 13300
TTACCTATAA TCCTCTTCGC AATTTCAAGG GTGGGAATGA GAGGTGGGGA CGAGAACCCC CTCTTCCTGG 13370
GGGTGGGAGG TAAGGGTTTT GCAGGTGCAC GTGGTCAGCC AATATGCAGG TTTGTGTTTA AGATTTAATT 13440
GTGTGTTGAC GGCCAGGTGC GGTGGCTCAC GCCGGTAATC CCAGCACTTT GGGAAGCTGA GGCAGGTGGA 13510
TCACCTGAGG TCAGGAGTTT GAGACCAGCC TGACCAACAT GGTGAAACCC TATCTGTACT AAAAATACAA 13580
AAATTAGCTG GGCATGGTGG TGTGTGCCTG TAATCCCAGC TACTTGGGAG GCTGAGGCAG GAGAATCACT 13650
TGAACCCAGG AGGCGGAGGC TGCAGTGAGC TGAGATTGTG CCATTGTACT CCAGCCTGGG CGACAAGAGT 13720
GAAACTCTGT CTTTAAAAAA AAAAAGTGTT CGTTGATTGT GCCAGGACAG GGTAGAGGGA GGGAGATAAG 13790
ACTGTTCTCC AGCACAGATC CTGGTCCCAT CTTTAGGTAT GAAGAGGGCC ACATGGGAGC AGAGGACAGC 13860
AGATGGCTCC ACCTGCTGAG GAAGGGACAG TGTTTGTGGG TGTTCAGGGG ATGGTGCTGC TGGGCCCTGC 13930
CGTGTCCCCA CCCTGTTTTT CTGGATTTGA TGTTGAGGAA CCTCCGCTCC AGCCCCCTTT TGGCTCCCAG 14000
TGCTCCCAGG CCCTACCGTG GCAGCTAGAA GAAGTCCCGA TTTCACCCCC TCCCCACAAA CTCCCAAGAC 14070
ATGTAAGACT TCCGGCCATG CAGACAAGGA GGGTGACCTT CTTGGGGCTC TTTTTTTTCT TTTTTTCTTT 14140
TTATGGTGGC AAAAGTCATA TAACATGAGA TTGGCACTCC TAACACCGTT TTCTGTGTAC AGTGCAGAAT 14210
TGCTAACTCG GCGGTGTTTA CAGCAGGTTG CTTGAAATGC TGCGTCTTGC GTGACTGGAA GTCCCTACCC 14280
ATCGAACGGC AGCTGCCTCA CACCTGCTGC GGCTCAGGTG GACCACGCCG AGTCAGATAA GCGTCATGCA 14350
ACCCAGTTTT GCTTTTTGTG CTCCAGCTTC CTTCGTTGAG GAGAGTTTGA GTTCTCTGAT CAGGACTCTG 14420
CCTGTCATTG CTGTTCTCTG ACTTCAGATG AGGTCACAAT CTGCCCCTGG CTTATGCAGG GAGTGAGGCG 14490
TGGTCCCCGG GTGTCCCTGT CACGTGCAGG GTGAGTGAGG CGTTGCCCCC AGGTGTCCCT GTCACGTGTA 14560
GGGTGAGTGA GGCGCGGCCC CCGGGTGTCC CTGTCCCGTG CAGCGTGATT GAGGTGTGGC CCCCGGGTGT 14630
CCCTGTCACG TGTAGGGTGA GTGAGGCGCC ATCCCCGGGT GTCCCTGTCA CGTGTAGGGT GAGTGAGGCG 14700
TGGTCCCCGG GTGTCCCTGT CCCGTGCAGG GTGAGTGAGG CACTGTCCCC GGGTGTCCCT GTCACGTGCA 14770
GGGTGAGTGA GGCGCGGTCC CCGGGTGTCC CTCTCAGGTG TAGGGTGAGT GAGGCGCGGC CCCAGGGTGT 14840
CCCTGTCACG TGTAGGGTGA GTGAGGCACC GTCCCTGGGT GTCCCTCCCA GGTATAGGGT GAGTGAGGCA 14910
CTGTCCCCGG GTGTCCCTGT CACGTGCAGG GTGAGTGAGG CGCGGCCCCC GGGTGTCCCT CTCAGGTGCA 14980
GGGTGAGTGA GGCGCTGTCC CTGGGTGTCC CTGTCTCGTG TAGGGTGAGT GAGGCTCTGT CCCCAGGTGT 15050
CCTTGGCGTT TGCTCACTTG AGCTTGCTCC TGAATGTTTG CTCTTTCTAT AGCCACAGCT GCGCCGGTTG 15120
CCCATTGCCT GGGTAGATGG TGCAGGCGCA GTGCTGGTCC CCAAGCCTAT CTTTTCTGAT GCTCGGCTCT 15190
TCTTGGTCAC CTCTCCGTTC CATTTTGCTA CGGGGACACG GGACTGCAGG CTCTCGCCTC CCGCGTGCCA 15260
GGCACTGCAG CCACAGCTTC AGGTCCGCTT GCCTCTGTTG GGCCTGGCTT GCTCACCACG TGCCCGCCAC 15330
ATGCATGCTG CCAATACTCC TCTCCCAGCT TGTCTCATGC CGAGGCTGGA CTCTGGGCTG CCTGTGTCTG 15400
CTGCCACGTG TTGCTGGAGA CATCCCAGAA AGGGTTCTCT GTGCCCTGAA GGAAAGCAAG TCACCCCAGC 15470
CCCCTCACTT GTCCTGTTTT CTCCCAAGCT GCCCCTCTGC TTGGCCCCCT TGGGTGGGTG GCAACGCTTG 15540
TCACCTTATT CTGGGCACCT GCCGCTCATT GCTTAGGCTG GGCTCTGCCT CCAGTCGCCC CCTCACATGG 15610
ATTGACGTCC AGCCACAGGT TGGAGTGTCT CTGTCTGTCT CCTGCTCTGA GACCCACGTG GAGGGCCGGT 15680
GTCTCCGCCA GCCTTCGTCA GACTTCCCTC TTGGGTCTTA GTTTTGAATT TCACTGATTT ACCTCTGACG 15750
TTTCTATCTC TCCATTGTAT GCTTTTTCTT GGTTTATTCT TTCATTCCTT TTCTAGCTTC TTAGTTTAGT 15820
CATGCCTTTC CCTCTAASTG CTGCCTTACC TGCACCCTGT GTTTTGATGT GAAGTAATCT CAACATCAGC 15890
CACTTTCAAG TGTTCTTAAA ATACTTCAAA GTGTTAATAC TTCTTTTAAG TATTCTTATT CTGTGATTTT 15960
TTTCTTTGTG CACGCTGTGT TTTGACGTGA AATCATTTTG ATATCAGTGA CTTTTAAGTA TTCTTTAGCT 16030
TATTCTGTGA TTTCTTTGAG CAGTGAGTTA TTTGAACACT
ATACGTAGAG TATTTTAAGT TATCATTTTA TTATTGATTT 
ACCAATTATT TGAAGTTTGC GGAGCCTTGC TTTGTGATCT 
TGTAAATTTG ACATCCTGTC AATAGTGGGC ATGCATGTTC 
AAGCTTCTGT CTCCTTCTAG ATGCATGAAA TTCCAAGAAG 
TCTGTTCATT TCTTCTCGTT TGGTAGCATT TATGTGAGGC 
TTATCTTCCT GATGAGTGAA TCTTTTGGAG ACTTCTATGT 
TTGCTCTTAG TACTGCCACA CTGGGCTTCT TTTGATTAGT 
AATTTATATA TATATATATA TTTTTTTTTT TTTTGAGACA 
AGTGGTGTGA TCACAGGTCA GTGTAACTTT TACCTTCTGG 
AGTAGCTGGA ACTGCAGACA CGCACCGCTA CACCTGGCTA 
TGCTGTGTTG CCCAGGCTGG TCTCAAACTC TTGGACTCAA 
CTGAATTACA GGCATGAGCC ACCATGTCTG GCCTAATTTT 
TGTCCTGTTA ACAGCATGTA GGTGAATTTC CAATCCAGTC 
TTTATTTTCA TTTTTTTGTC ACTAGAGACC CGCCTGGTGC 
CCTCGTTCCC TTGTTTCTCA CCACCTCTTG GGTTGCCATG 
CTCGTTGCCT CCTGGTCACT GGGCATTTGC TTTTATTTCT 
ATTGTCGTTG TTTGCTTTTG TTTATTGAGA CAGTCTCACT 
ATCTCGGCTC ACTGCAACCT CTGCCTCCTC GGTTCAAGCA 
GGATTACAGG CGCCCACCAC CACGCCTGGC TAATTTTTGT 
TGGCCAGGCT GGTCTCAAAC TCCTGACCTC AAGTGATCTG 
ACAGGTGCAA GCCACCGTGC CCGGCATACC TTGATCTTTT 
TCCTGAGCAA TAAGACCCTT AGTGTATTTT AGCTCTGGCC 
TGACTTAGTT CTATCTCAGG CATCTTGACA CCCCCACAAG
GAGTGTTTCT GTAGCTTTGC CCCCGCCCTG CTTTTCCTCC 
CCGCCGTCTG GGGTCCCCTT CCTTGTCCTT TGCGTGGTTC 
CTTTACCTGT GCTGGCCTCC ATGGCATCTA GCGACGTCCG 
ATGTGGAGAC TCACGAGGAG GGCGGTCATC TTGGCCCGTG 
CTTAGCCAGT GAGTGACAGC AACGTCCGCT CGGCCTGGGT 
TCTGGTGGCT CCGCGGTGTC GAGTTTGAAA TCGCGCAAAC 
GCCTGGCGGG GGAGTGTCTG CTTCCTCCCT TCTGCTTGGG 
TTGTCGCCCA ACAGGAGCAT GACGTGAGCC ATGTGGATAA 
TCACGCCTGT AATCCCAGCA CTTTGGGAGG CCAAGGCGGG 
CCTGGCCAAC ATGATGAAAC CCCATCTGTA CTAAAAACAC 
TGTAATCCCA GCTACTCGGG AGGCTGAGGC AGGAGAATTG 
GCCGACATTG CACCACTGCA CTCCAGCCTG GCAACACAGC 
AAAAAAAAAA AATTCTAGTA GCCACATTAA AAAAGTAAAA 
TTTTACTGAA GCCCAGCATG TCCACACCTC ATCATTTTAG 
ACATTTGACA TTTTTTGAGC TTTGTCTGCG GGATCCCGTG 
GGACCTGCTG GGCTTCCCAT GGCCATGGCT GTTGTACCAG 
CCCTCAGTGA GCTGGATGTG CAGTGTCCGG ATGGTGCACG 
AGCTGGATGT GTGGTGTCTG GATGGTGCAG GTCAGGGGTG 
ATGGAGTCCG GATGATGCAG GTCCGGGGTG AGGTCGCCAG 
GGATGGTGCA GGTCAGGGGT GAGGTCTCCA GGCCCTCGGT 
GGTCCGGGGT GAGGTCGCCA GGCCCTGCTG TGAGCTGGAT 
TGAGGTCACC AGGCCCTGCG GTGAGCTGGG TGTGCGGTGT 
CAGACGGTGC CAGACCATGC GGTGAGCTGG ATATGCGGTG 
CCAGGCCCTG CTGTGAGTTG GATGTGGGGT GTCCGGATGC 
GCTGTGAGCT GGATGTGTGG TGTCTGGATG GTGCAGGTCT 
AGCTGGATGT GTGGTGTCTG GATGGTGCAG GTCTGGAGTG 
GCAGTGTCCA GATGGTGCAG GTCCGGGGTG AGGTCGCCAG 
GGATGGTGCA GGTCTGGAGT GAGGTCGCCA GGCCCTCGGT 
GGTCCGGGGT GAGGTCGCCA GACCCTGCTG TGAGCTGGAT 
TGAGGTCGCC AGACCCTGCT GTGAGCTGGA TATGCGGTGT 
CAGGCCCTCG GTGAGCTGGA GGTATGGAGT CCGGATGATG 
TGTGAACTGG ATGTGCGGCG TCTGGATGGT GCAGGTCTGG 
AGGTATGGAG TCCGGATGAT GCAGGTCCGG GGTGAGGTCG 
GTCTGGATGG TGCAGGTCTG GGGTGTGGTC GCCAGGCCCT 
TGCAGGTCCG GGGTGAGGTT GCCAGGCCCT GCTGTGAGCT 
GGGTGAGGTC GCCAGGCCCT GCTGTGAGCT GGATGTGCTG 
CACCAGGCCC TGCGGTGAGC TGGTTGTGCG GTGTCCGGTT 
CTCGGTGAGC TGGATGTGCG GTGTCCCCGT GTCCGGATGG 
TGGTGGGCTG GATGTGCCGT GTCCGGATGG TGCAGGTCTG 
GATGTGCGGT GTCTGCATGG TGCAGGTCTG GGGTGAGGTC 
GTCCGGATGG TGCAGGTCCG GCGTGAGGTC GCCAGGCCCT 
GTGCAGGTCC GGGGTGAGGT AGCCAAGGCC TTCGGTGAGC 
CGGGGTGAGG TCGCCAGGCC CTGCGGTTAG CTGGATATGC 
GTCACCAGGC CCTGCGGTTA GCTGGATGTG CGGTGTCTGG 
CCCTGCTGTG AGCTGGATGT GCTGTATCCG GATGGTGCAG 
GAGCTGGATG TGCTGTATCC GGATGGTGCA GGTCTGGCGT 
ATGCGGTGTC GGATGGTGCA GGTCCGGGGT GAGGTCACCA 
CGGATGGTGC AGGTCTGGGG TGAGGTCGCC AGGCCCTGCT 
CAGGTCCGGG GTGAGGTCGC CAGGCCCTGC GGTGAGCTGG 
CGTGAGGTCG CCAGGCCCTG CGGTGAGCTG GATGTGCAGT 
GCCAGGCCCT GCGGTGGGCT GTATGTGTGT TGTCTGGATG 
TGCGGTGAGC TGGATGTGTG GTGTCTGGAT GCTGCAGGTC 
TGGATATGCG GTGTCCCCGT GTCCGAATGG TGCAGGTCCA 
GATGTGCCGT GTCCGGATGG TGCAGGTCTG GGGTGAGGTC 



GTTTATGTTC AAGATATGTA GAGTATCAAG 16100 
CTAACTCAGT TGTGTAGTGG TCTGTATAAT 16170 
AGTGTGTGCA TGGTTTCCAG AACTGTCCAT 16240 
ACTATATCCA GCTTATTAAG GTCCAGTGCA 16310 
GAGGCCATAG TCCCTCACCT GGGGGATGGG 16380
ATTGTTAGGT GCATGCACGT GGTAGAATTT 16450
CTCTAGTAAT CTAGTAATTC TTTTTTTAAA 16520 
ATTTTCCTGC TGTGTCTGTT TTCTGCCTTT 16590 
GAGTCTTGGT CTGTCGCCCA GGGTGAGTGC 16660 
CCTGAGCCGT CCTCTCACCT CAGCCTCCTG 16730
ATTTTTAAAT TTTTTCTGGA GACAGGGTCT 16800 
GGGATCCATC TACCTCGGCT TCCCAAAGTG 16870 
CAACACTTTT ATATTCTTAT AGTGTGGGTA 16940
TGACAGTCGT TGTTTAACTG GATAACCTGA 17010 
ACTCTGATTC TCCACTTGCC TGTTGCATGT 17080 
TGCGTTTCCT GCCGAGTGTG TGTTGATCCT 17150 
CTTTGCTTAG TGTTACCCCC TGATCTTTTT 17220
CTGTCACCCA GGCTGGAGTG TAATGGCACA 17290
GTTCTCATTC CTCAACCTCA TGAGTAGCTG 17360
ATTTTTAGTA GAGATAGGCT TTCACCATGT 17430 
CCCGCCTTGG CCTCCCACAG TGCTGGGATT 17500
AAAATGAAGT CTGAAACATT GCTACCCTTG 17570
ACCCCCCAGC CTGTGTGCTG TTTTCCCTGC 17640
CTAAGCATTA TTAATATTGT TTTCCGTGTT 17710 
TTTGTTCCCC GTCTGTCTTC TGTCTCAGGC 17780 
TTCTGTCTTG TTATTGCTGG TAAACCCCAG 17850
GGGACCTCTG CTTATGATGC ACAGATGAAG 17920
AGTGTCTGGA GCACCACGTG GCCAGCGTTC 17990
TCAGCCTGGA AAACCCCAGG CATGTCGGGG 18060
CTGCGGTGTG GCGCCAGCTC TGACGGTGCT 18130 
AACCAGGACA AAGGATGAGG CTCCGAGCCG 18200
TTTTAAAATT TCTAGGCTGG GCGCGGTGGC 18270 
TGGATCACGA GGTCAGGAGG TCGAGACCAT 18340
AAAAATTAGC TGGGCGTGGT GGCGGGTGCC 18410
CTTGAACCTG GGAGTTGGAA GTTGCAGTGA 18480
GAGACTCTGT CTCAAAAAAA AAAAAAAAAA 18550
AAGAAAAGGT GAAATTAATG TAATAATAGA 18620
GGTGTTATTG GTGGGAGCAT CACTCACAGG 18690
TGTAGGTCCC GTGCGTGGCC ATCTCGGCCT 18760
ATGGTGCAGG TCCGGGATGA GGTCGCCAGG 18830
TCTGGGATGA GGTCGCCAGG CCCTGCTGTG 18900
AGGTCTCCAG GCCCTCGGTG AGCTGGAGGT 18970
GCCCTGCTGT GAGCTGGATG TGTGGTGTCT 19040 
AAGCTGGAGG TATGGAGTCC GGATGATGCA 19110 
GTGTGGTGTC TGGATGGTGC AGGTCTGGGG 19180 
CTGGATGGTG CAGGTCTGGA GTGAGGTCGC 19250 
TCCGGATGGT GCAGGTCTGG GGTGAGGTTG 19320
TGCAGGTCCG GTGTGAGGTC ACCAGGCCCT 19390 
GGGGTGAAGG TCGCCAGGCC CCTGCTTGTG 19460
AGGTCGCCAG GCCCTCGGTG AGCTGGATGT 19530
ACCCTGCGGT GAGCTGGATG TGCGGTGTCT 19600 
GAGCTGGATG TATGGAGTCC GGATGGTGCC 19670 
GTGCGGTGTC TGGATGGTAC AGGTCTGGAG 19740 
CCGGATGGTG CAGGTCAGGG GTGAGGTCTC 19810 
CAGGTCCGGG GTGAGGTCGC CAGGCCCTGC 19880 
GGTGTGGTCG CCAGGCCCTC GGTGAGCTGG 19950 
CCAGGCCCTG CTGTGAGCTG GATGTGCGGC 20020 
CGGTGAGCTG GAGGTATGGA GTCCGGATGA 20090
GGATGTGCTG TATCCGGATG GTGCAGTCCG 20160 
TATCCGGATG GTGCAGGTCT GGGGTGAGGT 20230
GCTGCAGGTC CGGGGTGAGT TCGCCAGGCC 20300
TGCAGGTCCA GGGTGAGGTC GCTAGGCCCT 20370 
GGGTGAGGTC GCCAGGCCTT TGGTGAGCTG 20440
GCCAGGCCCT TGGTGGGCTG GATGTGTGGT 20510
GCTGTGAGCT GGATGTGCGG TGTCTGGATG 20580
TGGATGTGGG GTGTCCGGAT GGTGCAGGTC 20650 
GGTGTCCGGA TGGTGCAGGT CCGGGGTGAG 20720 
ATGGTGCAGG TCCGGGGTGA GGTCGCCAGG 20790 
GTCCGGGGTG AGGTCGCCAG GCCCTGCAGT 20860
GAGGTCGCCA GGCCCTGCGG TTAGCTGGAT 20930
GGCCCTGCGG TTAGCTGGAT GTGCGGTGTC 21000 
GTGAGCTGGA TGTGCTGTAT CCGGATGGTG 21070 
ATGTGCTGTA TCCGGATGGT GCAGGTCTGG 21140 
GTACGGATGG TGCAGGTCCG GGGTGAGGTC 21210 
GTGCAGGTCC GGGGTGAGTT CGCCAGGCCC 21280 
CGGGGTGAGT TCGCCAGGCC CTCGGTGAGC 21350 
GGGTGAGGTC GCCAGGCCCT TGGTGGGCTG 21420 
GCCAGGCCCT TGGTGAGCTG GATGTGCGGT 21490 
GTCCGGATGG TGCAGGTCCG GGGTGAGGTC ACCAGGCCCT
TTTAAGGGGT TGGCTGTGTT CCGGCCGCAG AGCACCGTCT 
CTGGCTGATG AGTGTGTACG TCGTCGAGCT GCTCAGGTCT 
AAGAACAGGC TCTTTTTCTA CCGGAAGAGT GTCTGGAGCA 
TCCCCACGCC AGGCCTCTGC TTCTCGAAGT CCTGGAACAC 
CTTGCCTGTG CTTCCCTGGC TGTGCAGCTC TGGGCTGGGA 
GTGGATTCTG TGCAAGGCTC TGACTGCCTG GAGCTCACGT 
CAAGTGGTCT CTAGGGTTTG TAAAGCAGAA GGGATTTAAA 
GCCTTTCCCT GGGATGTGGG TCTGATTCTC TCTCTCTTTT 
TGTTGCCCAG GCTGGAGTGC AGTGGCATAA TCTTGGCTCA 
TTCACCAGCC TCAGCCTCCT AAGTAGCTGG GATTACAGGC 
CTTTTAGGAG AGACGGGGTT TCACCATGTT GGCCAGGCTG 
CCACCTTGGC CTCCCAAAGT GCTGGGTTTA CAGGCTAAGC 
TTCATGCTGT TCTGTATGAA TCTTCAATCT ATTGGATTTA 
GGCGACTCAC TGCAGGGAGC ACCTGTGCAG GGAGCACCTG 
TCTAGGTGGC TGCATTTGAA TGGCTGTGAG ATTTTGTCTG 
TGACAGATTC AAGCTGGATT TGCATCAGTG AGGGACGGGA 
AGCCCAGGCC ATGGTATTAG CTTCTCCGTG TCCCGCCCAG 
CAGGGCTTCC CCAGCTCCCC TGCACACTCG AGTCCCTGGG 
GGATGTCTGC AGAGGGAGCT GGCAGCAGAC CTCGTCAGAG 
CGTGGTGCTG GGGCCATTTC CTTGCATCTG GGGGAGGGTC 
ACAATGCACC TTACTTAGAC TTTACACGTA TTTAATGGTG 
TTTGGAAAGA ATTTAATTGG GGTGACCGGA AGGAGCAGAC 
CACTACTGGG ACTGTTGTTC TGCCTGGGGG GCCTTGGAGG 
TTTCTACTCT GCTGGGCCTG CGGCCTGCGG TCAGGGCACC 
ACGGAGTGCC AGGCTGTCAG CCACAGATGC CCAGGTCCAG 
GGGTGGTTTT GGGGGAAAAG GCCAAGGGCA GAGGTGTCAG 
GCTCCTTGGC TGAGCTGCCC TGAGCAGCCT CTCCCGCCCT 
CTGGGGGTCC TGCCTGGGGC CAGCCTTGGG CTACCCCAGT 
GGAGGGGCAT GGGTTCACGT GGCCCCAGAT GCAGCCTGGG 
GTCACCCTGG GGGTTGACCG CCGGACTGGG CGTCCCCAGG 
CTGCAAGTAG AGGGGCTCTC AGAGGCGTCT GGCTGGCATG 
CGTGTGCTGC CGTGGGTGCC CTGAGCCCTC ACTGAGTCGG 
CCTAGTCTGT TGTCTGGCTG AGCAAGCCTC CTGAGGGGCT 
CAGCTGCGGG AGCTGTCGGA AGCAGAGGTC AGGCAGCATC 
GACTCCGCTT CATCCCCAAG CCTGACGGGC TGCGGCCGAT 
AACGTTCCGC AGAGAAAAGA GGGTGGCTGT GCTTTGGTTr 
GCCCCACATT TGGTATCAGC TTAGATGAAG GGCCCGGAGG
CACGGCGCCA ACCCATTTGT GCGCACAGTG AGGTGGCCGA 
GGGTGTAGGG GGAGCTCCTG GGGCAGGGAC AGGCTCTGAG 
GATGCAGCAC GGCCCGAGGT CCTGGATCCG TGTCCTGCTG 
CGGGGCCCGG GGACCAGGCC ACGACTGCCA GGAGCCCACC 
GGCTCCTGCA CCCCACCCCT GTGGCTGCGG TGGCTGCGGT 
AGGTGGACAG AGGTGTGGCA TGAGGATCCC GTGTGCAACA 
GGTCTGAGGA AGCTGGGAGG GGTTCTAGGT CCCGGGTCTG 
TCTCCCCTGG GTCCCTATGG TGGGGTGGGC ACTTGGCCGG 
CCCCGCCAGG CCGAGCGTCT CACCTCGAGG GTGAAGGCAC 
GGCGCCCCGG CCTCCTGGGC GCCTCTGTGC TGGGCCTGGA 
GCTGCGTGTG CGGGCCCAGG ACCCGCCGCC TGAGCTGTAC 
GAGCAGCCCT GCTGGACCTT GGGAGTGGCT GCCTGATTGG 
GGGTGGGCCG CAGGGAGTGC AGGTGACCCT GTCACTGTTG 
TTCAGCCTTT CCTGCAGCAC ATGGGGCCGA CTGTGCACCC 
GTCCCACTGG ATTCCAGTTT CCGTCAGAGA AGGAACCGCA 
CACCCCAGTC CTGAGCCAGG GGTCTCCTGT CCTGAGGCTC 
GGGTCTGGAG TGGTGGGGGT CAGAGAGAGA GTGGGGGACA 
ATGTCTGAGT TTCTGCGTGG CCACTGTCAG TCTCCTCGCC 
GTACGACACC ATCCCCCAGG ACAGGCTCAC GGAGGTCATC 
TGCGTGCGTC GGTATGCCGT GGTCCAGAAG GCCGCCCATG 
TAAGGTTCAC GTGTGATAGT CGTGTCCAGG ATGTGTGTCT 
GTGTCTGTGA TGCGTTTCTG TGGTGGAGGT ACTTCCATGA 
CGTGTGTGTC GTGGTGCATG TATCTGTGGC GTGCATATTT
CCATGGTGTG TGTGCCTGTG GTGTGCATGT GTGTGTGTCT 
CATGTCTGTG ATGTGCCTAT TTGTGGTGTG TGTGTGCATG 
GGTGTGTGTG GCCCCTTGGC CTTACTCCTT CCTCCTCCAG 
CGGGTGCTGG TTTGGGGAGC TCCACATTCA GGGTCCTCAC 
GGGCTGGGCC TTGGAGACTG TAAGCCAGGT TTGAGAGGAG 
CCCTGGCACC CCCAGGACCC CAGTCTGGC Z TATGCCGGCT 
TCGCTCCCCG GGACACACTC CTCCCAGAGC GGCCGGGGGC 
TGGGCTTGGG TTCCCACCCA GTGGTCATGA GCACGCTGGA 
GGGTGCAGAG GTGAAGAAGT ATCCCTGGAG CTTCGGTCTG 
CCTCTTTCTC TGACTTCTTG AGCT 



CGGTGATCTG GATGTGGCAT GTCCTTCTCG 21560
GCGTGAGGAG ATCCTGGCCA AGTTCCTGCA 21630 
TTCTTTTATG TCACGGAGAC CACGTTTCAA 21700 
AGTTGCAAAG CATTGGAATC AGGTACTGTA 21770 
CAGCCCGGCC TCAGCATGCG CCTGTCTCCA 21840
GCCAGGGGCC CCGTCACAGG CCTGGTCCAA 21910 
TCTCTTACTT GTAAAATCAG GAGTTTGTGC 21980 
TTAGATGGAA ACACTACCAC TAGCCTCCTT 22050 
TTTTTTCTTT TTTGAGATGG AGTCTCACTC 22120 
CTGCAACCTC CACCTCCTGG GTTTAAGCGA 22190 
ACCTGCCACC ACGCCTGGCT AATTTTTGTA 22260
GTCTCGAACT CATGACCTCA GGTGATCCAC 22330 
CACCGTGCCC AGCCCCCGAT TCTCTTTTAA 22400 
GGTCATGAGA GGATAAAATC CCACCCACTT 22470
GGGATAGGAG AGTTCCACCA TGAGCTAACT 22540
CAATGTTCGG CTGATGAGAG TGTGAGATTG 22610
GCGCTGGTCT GGGAGATGCC AGCCTGGCTG 22680
GCTGACTGTG GAGGGCTTTA GTCAGAAGAT 22750 
GGGCCTTGTG ACACCCCATG CCCCAAATCA 22820
GTAACACAGC CTCTGGGCTG GGGACCCCGA 22890
AGGGCTTTCC CTGTGGGAAC AAGTTAATAC 22960
TGCGACCCAA CATGGTCATT TGACCAGTAT 23030 
AGACGTGGTG GTCCCCAAGA TGCTCCTTGT 23100 
CCCCTCCTCC CTGGACAGGG TACCGTGCCT 23170
AGCTCCGGAG CACCCGCGGC CCCAGTGTCC 23240
GTGTGGCCGC TCCAGCCCCC GTGCCCCCAT 23310
GAGACTGGTG GGCTCATGAG AGCTGATTCT 23380
CTCCATCTGA AGGGATGTGG CTCTTTCTAC 23450 
GGCTGTACCA GAGGGACAGG CATCCTGTGT 23520 
ACCAGGCTCC CTGGTGCTGA TGGTGGGACA 23590 
GTTGACTATA GGACCAGGTG TCCAGGTGCC 23660
GGTGGACGTG GCCCCGGGCA TGGCCTTCAG 23730
TGGGGGCTTG TGGCTTCCCG TGAGCTTCCC 23800
CTCTATTGCA GACAGCACTT GAAGAGGGTG 23870
GGGAAGCCAG GCCCGCCCTG CTGACGTCCA 23940
TGTGAACATG GACTACGTCG TGGGAGCCAG 24010
AACTTCCTTT TTAAACAGAA GTGCGTTTGA 24080
AGGGGCCACG GGACACAGCC AGGGCCATGG 24150
GGTGCCGGTG CCTCCAGAAA AGCAGCGTGG 24220
GACCACAAGA AGCAGCCGGG CCAGGGCCTG 24290
TGGTGCGCAG CCTCCGTGCG CTTCCGCTTA 24360
GGGCTCTGAG GATCCTGGAC CTTGCCCCAC 24430
GACCCCGTCA TCTGAGGAGA GTGTGGGGTG 24500
CACATGCGGC CAGGAACCCG TTTCAAACAG 24570
GGTGGCTGGG GACACTGGGG AGGGGCTGCT 24640
ATCCACTTTC CTGACTGTCT CCCATGCTGT 24710
TGTTCAGCGT GCTCAACTAC GAGCGGGCGC 24780 
CGATATCCAC AGGGCCTGGC GCACCTTCGT 24850
TTTGTCAAGG TGGGTGCCGG GGACCCCCGT 24920
CACCTCATGT TGGGTGGAGG AGGTACTCCT 24990
AGGACACACC TGGCACCTAG GGTGGAGGCC 25060
TGACTGCCCG GGCTCCTATT CCCAAGGAGG 25130
ACGGCTCAGC CACCAGGCCC CGGTGCCTTG 25200
AGAGAGGGGA CACAGCCCGC CCTGCCCTTG 25270
CCGCCAGGCC AGGCCCTGAG GGCAGAGGTG 25340
TCCACTCACA CAGGTGGATG TGACGGGCGC 25410
GCCAGCATCA TCAAACCCCA GAACACGTAC 25480
GGCACGTCCG CAAGGCCTTC AAGAGCCACG 25550
CTGGGATATG AATGTGTCTA GAATGCAGTC 25620
TTTACACATC TGTGATATGC GTGTGTGGCA 25690
GTGGTGTGTG TGTGTGTGGC ACGTGTGTGT 25760
GTGACACGTG CATGTTCATG CTGTGTGCTG 25830
TGTCCGTGAC ATATGCGTGT CTATGGCATG 25900
GCATGGTCCG CACCATTGTC CTCACGCTCT 25970 
TTCTAGCATG GGTGCCCCTG TCCTGTCACA 26040
AGTAGGGATG CTGGTGGTAC CTTCCTGGAC 26110
CCATGAGATA TAGGAAGGCT GATTCAGGCC 26180
CTTGGGGCTC GGCAGGGGTG AAAGGGGCCC 26250
GGGGTAAGCC CTCAAAGTCG TGCCAGGCCG 26320
GGGAGAGGCA CATGTGGAAA CCCACAAGGA 26390

26414
Contig 2:



TGTGGGATTG GTTTTCATGT GTGGGATAGG TGGGGATCTG TGGGATTGGT TTTTATGAGT GGGGTAACAC 70

AGAGTTCAAG GCGAGCTTTC TTCCTGTAGT GGGTCTGCAG GTGCTCCAAC AGCTTTATTG AGGAGACCAT 140 

ATCTTCCTTT GAACTATGGT CGGGTTTATA GTAAGTCAGG GGTGTGGAGG CCTCCCCTGG GCTCCCTGTT 210 

CTGTTTCTTC CACTCTGGGG TCGTGTGGTG CCTGCTGTGG TGTGTGGCCG GTGGGCAGGG CTTCCAGGCC 280 

TCCTTGTGTT CATTGGCCTG GATGTGGCCC TGGCTACGCT CCGTCCTTGG AATTCCCCTG CGAGTTGGAG 350 

GCTTTCTTTC TTTCTTTTTT TCTTTCTTTT TTTTTTTTTT TGATAACAGA GTCTCGCTCT TTTTTGCCCA 420

GGCTGGAGTG GTTTGGCGTG ATCTTGGCTC ACTGCAACCT GTGCTTCCTG AGTTCAAGCA ATTCTCTTGC 490

CTCAGCCTCC CAAGTAGCTG GAATTATAGG CGCCCACCAC CATGCTGACT AATTTTTGTA ATTTTAGTAG 560

AGACGAGGTT TCTCCATGTT GGCCAGGCTG GTCTCGAACT CCTGACCTCA GGTGATCCTC CCACCTCGGC 630

CTCCCAAAGT GCTGGGATGA CAGGTGTGAA CCGCCGCGCC CGGCCGAGAC TCGCTTCCTG CAGCTTCCGT 700

GAGATCTGCA GCGATAGCTG CCTGCAGCCT TGGTGCTGAC AACCTCCGTT TTCCTTCTCC AGGTCTCGCT 770

AGGGGTCTTT CCATTTCATG ACTCTCTTCA CAGAAGAGTT TCACGTGTGC TGATTTCCCG GCTGTTTCCT 840

GCGTAATTGG TGTCTGCTGT TTATCGATGG CCTCCTTCCA TTTCCTTTAG GCTTTGTTTA TTGTTGTTTT 910 

TCCGGCTCCT TGAAGGAAAA GTTTCGATTA TGGATGTTTG AACTTTCTTT TCTAAACAAG CATCTGAAGT 980 

TGCCGTTTTC CCTCTAAAGC AGGGATCCCG AGGCCCCTGG CTGTGGAGTG GCACCGGTCT GGGGCCTGTT 1050 

AGGAACCCGG CGCACAGCGG GAGGCTAGGT GGGGTGTGGG GAGCCAGCGT TCCCGCCTGA GCCCCGCCCC 1120

TCTCAGATCA GCAGTGGCAT GCGGTGCTCA GAGGCGCACA CACCCTACTG AGAACTGTGC GTGAGAGGGG 1190 

TCTAGATTCT GTGCTCCTTA TGGGAATCTA ATGCCTGATG ATCTGAGGTG GAACCGTTTG CTCCCAAAAC 1260 

CATCCCCTTC CCCACTGCTG TCCTGTGGAA AAATCGTCTT CCACGAAACC AGTCCCTGGT ACCACAATGG 1330 

TTGGGGACCC TGTGCTAAAG ACCTGCTTCA GCAGCCTCTC GTCAGTGTTG ATATATTGGC TTTTCTGTGT 1400

TGAGTCCAGA ATAATTACGG ATTTCTGTGA TGCTTTCCGC CGACCTCAGA CCCATGGGCT ATTTGTGGGC 1470

GTGTTGCCTG CTCCTGGGTT GGGAAGGGTG CAGGCCCCAT GTACCTTCCT GTTACTGCCT TCCAGGTTGG 1540

TTCTCAGGGT TGAATCGTAC TCGATGTGGT TTTAGCCCAC GGCCCTGCCG CCAGCTCCTG GGGGCTGGGG 1610 

AACATGCTGA AGCACAGAGT CACCGTGCGC GTCTTTTGAT GCCTCACAAG CTCGAGGCCT CCTGTGTCCG 1680 

TGTTAGTGTG TGTCACGTGC CTGCTCACAT CCTGTCTTGG GGACGCAGGG GCTTAGCAGG TCCCGTAGTA 1750

AATGACAAGC GTCCTGGGGG AGTCTGCAGA ATAGGAGGTG GGGGTGCCGG TCTCTCTCCC GCGTCTTCAG 1820

ACTCTTCTCC TGCCTGTGCT GTGGCTGCAC CTGCATCCCT GCAATCCCTC CAGCACTGGG CTGGAGAGGC 1890

CCGGGAGCTC GAGTGCCACT TGTGCCACGT GACTGTGGAT GGCAGTCGGT CACGGGGGTC TGATGTGTGG 1960

TGACTGTGGA TGGCGGTTGG TCACAGGGGT CTGATGTGTG GTGACTGTGG ATGGCGGTCG TGGGGTCTGA 2030

TGTGGTGACT GTGGATGGCG GTCGTGGGGT CTGATGTGTG GTGACTGTGG ATGGCGGTCG TGGGGTCTGA 2100 

TGTGGTGACT GTGGATGGCG GTCGTGGGGT CTGATGTGGT GACTGTGGAT GGCGGTCGTG GGGTCTGATG 2170 

TGGTGACTGT GGATGGCAGT CGTGGGGTCT GATGTGTGGT GACTGTGGAT GGCGGTCGTG GGGTCTGATG 2240

TGGTGACTGT GGATGGCAGT CGTGGGGTCT GATGTGTGGT GACTGTGGAT GGCGGTCGTG GGGTCTGATG 2310

TGTGGTGACT GTGGATGGCG GTCGTGGGGT CTGATGTGTG GTGACTGTGG ATGGCGGTCG TGGGGTCTGA 2380

TGTGTGGTGA CTGTGGATGG CGGTCGTGGG GTCTGATGTG GTGACTGTGG ATGGCGGTCG TGGGGTCTGA 2450

TGTGTGGTGA CTGTGGATGG TGATCGGTCA CAGGGGTCTG ATGTGTGGTG ACTGTGGATG GCGGTCGTGG 2520

GGTCTGATGT GTGGTGACTG TGGATGGTGA TCGGTCACAG GGGTCTGATG TGTGGTGACT GTGGATGGCG 2590 

GTCGTGGGGT CTGATGTGTG GTGACTGTGG ATGGCGGTTG GTCCCGGGGG TCTGATGTGT GGTGACTGTG 2660 

GATGGCGATC GGTCACAGGG GTCTGATGTG TGGTGACTGT GGATGGCGGT CGTGGGGTCT GATGTGTGGT 2730

GACTGTGGAT GGCGGTCGTG GGGTCTGATG TGTGGTGACT GTGGATGGCG GTCGTGGGGT CTGATGTGGT 2800 

GACTGTGGAT GGCGGTCGTG GGGTCTGATG TGGTGACTGT GGATGGCGGT CGTGGGGTCT GATGTGTGGT 2870 

GACTGTGGAT GGCGGTTGGT CCCGGGGGTC TGATGTGTGG TGACTGTGGA TGGCGGTCGT GGGGTCTGAT 2940 

GTGGTGACTG TGGATGGCAG TCGTGGGGTC TGATGTGTGG TGACTGTGGA TGGCGGTCGT GGGGTCTGAT 3010 

GTGTGGTGAC TGTGGATGGC GGTCGTGGGG TCTGATGTGT GGTGACTGTG GATGGCGGTC GTGGGGTCTG 3080 

ATGTGTGGTG ACTGTGGATG GCGGTCGTGG GGTCTGATGT GGTGACTGTG GATGGCGGTC GTGGGGTCTG 3150 

ATGTGTGGTG ACTGTGGATG GTGATCGGTC ACAGGGGTCT GATGTGTGGT GACTGTGGAT GGCGGTCGTG 3220 

GGGTCTGATG TGTGGTGACT GTGGATGGCG GTCGTGGGGT CTGATGTGGT GACTGTGGAT GGCGGTCGTG 3290 

GGGTCTGATG TGTGGTGACT GTGGATGGCG GTCGTAGGGT CTGATGTGTG GTGACTGTGG ATGGCAGTCG 3360 

GTCACAGGGG TCTGATGTGT GGTGACTGTG GATGGCGGTC GTGGGGTCTG ATGTGTGGTG ACTGTGGATG 34 30 

GCGGTCGTGG GGTCTGATGT GTGGTGACTG TGGATGGCGG TCGTGGGGTC TGATGTGTGG TGACTGTGGA 3500 

TGGCGGTCGT GGGGTCTGAT GTGGTGACTG TGGATGGTGA TCGGTCACAG GGGTCTGATG TGTGGTAGCT 3570

GCAGGTGGAG TCCCAGGTGT GTCTGTAGCT ACTTTGCGTC CTCGGCCCCC CGGCCCCCGT TTCCCAAACA 3640

GAAGCTTCCC AGGCGCTCTC TGGGCTTCAT CCCGCCATCG GGCTTGGCCG CAGGTCCACA CGTCCTGATC 3710

GGAAGAAACA AGTGCCCAGC TCTGGCCGGG GCAGGCCACA TTTGTGGCTC ATGCCCTCTC CTCTGCCGGC 3780

AGGTCTCTAC CTTGACAGAC CTCCAGCCGT ACATGCGACA GTTCGTGGCT CACCTGCAGG AGACCAGCCC 3850

GCTGAGGGAT GCCGTCGTCA TCGAGCAGGT CTGGGCACTG CCCTGCAGGG TTGGGCACGG ACTCCCAGCA 3920

GTGGGTCCTC CCCTGGGCAA TCACTGGGCT CATGACCGGA CAGACTGTTG GCCCTGGGGG GCAGTGGGGG 3990

GAATGAGCTG TGATGGGGGC ATGATGAGCT GTGTGCCTTG GCGAAATCTG AGCTGGGCCA TGCCAGGCTG 4060

CGACAGCTGC TGCATTCAGG CACCTGCTCA CGTTTGACTG CGCGGCCTCT CTCCAGTTCC GCAGTGCCTT 4130 

TGTTCATGAT TTGCTAAATG TCTTCTCTGC CAGTTTTGAT CTTGAGGCCA AAGGAAAGGT GTCCCCCTCC 4200 

TTTAGGAGGG CAGGCCATGT TTGAGCCGTG TCCTGCCCAG CTGGCCCCTC AGTGCTGGGT CTGAGGCCAA 4270 

AGGAAACGTG TCCCCCTTCT TAGGAGGACG GGCCGTGTTT GAGCCACGCC CCGCTGAGCG GGCCTCTCAG 4340

TGCTGGGTCT GTCCACGTGG CCCTGTGGCC CTTTGCAGAT GTGGTCTGTC CACGTGGCCC TGTGGCTCTT 4410

TGCAGATGCC TGTTAGCACT TGCTCGGCTC TAGGGGACAG TCGTGTCCAC CGCATGAGGC TCAGAGACCT 4480

CTGGGCGAAT TTCCTTGGCT CCCAGGGTGG GGGTGGAGGT GGCCTGGGCT GCTGGGACCC AGACCCTGTG 4550

CCCGGCAGCT GGGCAGCAAC TCCTGGATCA CATATGCCAT CCGGGCCACG GTGGGCTGTG TGGGTGTGAG 4620

CCCAGCTGGA CCCACAGGTG GCCCAGAGGA GACGTTCTGT GTCACACACT CTGCCTAAGC CCATGTGTGT 4690

CTGCAGAGAC TCGGCCCGGC CAGCCCACGA TGGCCCTGCA TTCCAGCCCA GCCCCGCACT TCATCACAAA 4760

CACTGACCCC AAAAGGGACG GAGGGTCTTG GCCACGTGGT CCTGCCTGTC TCAGCACCCA CCGGCTCACT 4830

CCCATGTGTC TCCCGTCTGC TTTCGCAGAG CTCCTCCCTG AATGAGGCCA GCAGTGGCCT CTTCGACGTC 4900

TTCCTACGCT TCATGTGCCA CCACGCCGTG CGCATCAGGG GCAAGTGAGT CAGGTGGCCA GGTGCCATTG 4970

CCCTGCGGGT GGCTGGGCGG GCTGGCAGGG CTTCTGCTCA CCTCTCTCCT GCCCCTTCCC CACTGNCCTT 5040

CTGCCCGGGG CCACCAGAGT CTCCTTTTCT GGCCCCCGCC CCCTCCGGCT CCTGGGCTGC AGGCTCCCGA 5110

GGCCCCGGAA ACATGGCTCG GCTTGCGGCA GCCGGAGCGG AGCAGGTGCC ACACGAGGCC TGGAAATGGC 5180

AAGCGGGGTG TGGAGTTGCT CCTGCGTGGA GGACGAGGGG CGGGGGGTGT GTCTGGGTCA GGTGTGCGCC 5250
GAGCGTTTGA
GTATCTCTCT 
GAGGGCGCGG 
CTCCTCTCGG 
GTGAGCCACA 
ACCACACACC 
AATTCGTGCA 
GATCGTCTCC 
ATTGCATTAT 
AGTAGTACAC 
TGTGCACATG 
ACCACGGGCC 
GGGCTTTGGG 
CTGTCCCTAG 
ACGTCTTCAA 
CGTATCTGCT 
AGTGCAGTCC 
TTTTTTTTTT 
TGCAACCTCC 
CCCACCCCCT 
GTCTCGAACT 
CCATCACGCC 
TGTTTTTGCA 
AGGGTCGCGT 
TGTCTGAAAA 
GGGCAAGGTT 
CCTTGAAAAA 
TCTAAGATTT 
GCTGGATCTG 
ATCTGGTATG 
TCTGTGTGTG 
CCGCCCCAGG 
CTGTGCTACG 
CCAGGGGGGC 
ACCATGACTG 
GGGTTGCACA 
GCTCAGCAGG 
CTTCTGTCAC 
ATCTCCCAGC 
ATGGGGTCTA 
TTTTATTGAC 
GTTAGTTACA 
TATCTCCTAA 
CTGTGTCCAA 
CTTGCAATAG
CTTTTTTATG 
GGACATTTGG 
CTTTATAGCA 
TTCTAGTTCT 
CAACAGTGTA 
ACTGAACAAG 
GACTCCGGGT 
TCTTCTTCAT 
CACAACTTGC 
GGATACCTCT 
CTTCCTGTGG 
ACGTGGCTCA 
TCTGAGAAGT 
GTAAACACTG 
AAAGAAATTT 
GACATTCAGT 
ATTTTAGGCT 
CTTCCTCAGG 
CCCCGTGTCC 
GCAAGAGCAG 
ACCGTGCAGG 
CTCCTGCTGC 
ACAGGGCCTG 
GCTGAGTGAG 
GAATTTGGAT 
TTTGTGAATC 
CCTCAAGGGC 
AAAGCTGTAA 
TTGTGAAAAC 
TGGGGGTATG 
AGATTCACTC 
AGAGCAGACG 
TGTATGTTGG 



GCCTGCAGCT 
CTCCCGATAC 
GCTCTTCTCT 
GGGGCCTGTG 
CTCACGGTGG 
TCCCGGCAGG 
CACTCAAGGT 
AGCGGATAAA 
AAGTAATCAC 
ACGTTCTGGA 
TGTGTAAGCG 
TCCTTCGTGG 
GAATGTGAGG 
GACACGGACA 
AACCTGTTGC 
TGCGTTGACT 
TCTTGCCCAT 
GAGACGGAAC 
GCCTCCCGGG 
GCGCCTGGCT 
CCTGACCTCA 
CAGCCGGAAA 
GTAGGCTGCA 
GGCAGCCATG 
CGCACCCTTG 
GTATCCGTTG 
AAAAAAAGGA 
AAGAAACCTT 
TTTCAGCCGC 
TTTCTGAGGT 
GCTCGGTTTG 
TCCTACGTCC 
GCGACATGGA 
TTGGGTGGGG 
CTCTGTCTTG 
GCCTGAGGAC 
CGGGAGGGCC 
GTCACCCAGG 
AGGCCCTCGA 
CACCCAAGGA 
AGCAGTTACT 
TATGTATACA 
TGCTATCCCT 
GTGTTCTCAT 
TTTGCTCAGA 
ACTGCATAGT 
GTTGGTTGCA 
GCATGATTTA 
AGATCCTTGA 
AAAGTGTTCT 
CAGACAGTTA 
TTTTCCTGTG 
TAAGGTTCAA 
TCTGGGATTT 
GGCCCATGGT 
ATAGGATCTG 
GAGGGGGCGA 
GGGGCCGCGC 
AGTACTTATA 
AAGTTTTTCA 
CCCTCGTAGA 
GCTCCTGCGT 
TGAGGCCCGT 
TGCCCCTGGC 
AGGCGCTTGG 
CCCTGGTCCT 
TCTTTGGAAA 
CAGGGCCGAG
CTGGCCCACA 
TTGCTGAGTG 
AAACTAAAAT
GCCCCACAGA 
AGGGAACCCT 
CCATTTGGAC 
CCTGGCGTTC 
GGGGGGAGCC 
GGGAACGCTG 
GGTCCCACCG 



TGTCAGCTCC 
AAAAGGATTT 
CTGTGACTAG 
GTGGCCATGG 
TAGAGCCACA 
CATCTGCCTG 
CATCAGCAAG 
GGACTGTGCA 
TAATGGTATC 
AAAACACAAA 
GCCCCCAGGC 
TCGTGAATTT 
TGATGACTGC 
GGCCCGAAGC 
CCCAAAAACT 
CGCTGGGCTG 
CACTGTGATA 
GTCACTGTTG 
TTCCAGCATT 
AATTTTTGTA 
GGTGATCCAC 
GCCTCTTTTT 
AGCGTCTCTT 
CCTTCTGTGT 
GCATCCTTGT 
GCGCGCAGCG 
GTCCGGTTAA 
AATGAAAGAA 
CCCAGTGCAT 
GTTTGCCGGC 
AGTGTACGCA 
AGTGCCAGGG 
GAACAAGCTG 
GTTGATTTGC 
AGGAACCAGA 
TGCGGGCTCC 
GCTGCCCTGC 
TTCCGTTAGG 
CAGGTGGCCT 
CGCACACACC 

TGTGCCATGT 
CCCCACTCCC 
TGTTCAGTTC 
GTGATGGTTT 
ATTCCGTGGT 
AGTCTTTGCT 
TAATCCTTTG 
GGAATCACCA 
GGTGCTGGAG 
GTGAAGGATG 
CATCTTTTGA 
GTTCTAGATT 
GGAGGAAAGT 
CATGGGGCGC 
GGTCTCGGAT 
GGTTCCCAGC 
CTGATGGCCT 
ATGAATGAGG 
TTTAACCGCT 
CAGATACTAC 
TTGGTGGATG 
GCCGTGTGTC 
ACCGCAGCGT 
CCGTGCACCC 
GCAGAGACGC 
GTCAAGAGTG 
GCGG^rtGCCT 
GCGTTCGCTG
CTGCTGTCTT
CAGGCACAGG 
GCCGGTGGGC 
CAGAAAATGT 
CCGCCCTCCA 
CTTGTGCCGC 
CAGGTCCCAA
CTTCTGTGTG 
GGGGCAGAAC 



AAGTTACTAC 

TATCCGATTC 

ATTTCCCATC 

GGCAGGCGGC 

GTGCCTGGTG 

CGACCCTGTG 

GTCATCCGCA 

CAGCTTCGGA 

AGCAATTATA 

TTGCACATGG 

CCACAGAATT 

TATTAAGATG 

GTCCTCATGC 

TCTAGTCCCC 

AAGAACAGAG 

GCCGGACTCC 

TCTGCACCAG 

TCTGCCTGGG 

TCTCCTGCCT 

TTTTTAGTAG 

CCACCTCGGC 

AAGGTGACCA 

AGCAACAGGA 

GCACCTTTAG 

TTGGAGAGTT 

GCTACATGTA 

GCATTCATTC 

AACCTTGATG 

GGTGAGAGTG 

TGAATGGTAG 

TGTCCAGCAC 

GATCCCGCAG 

TTTGCGGGGA 

TTTTGATGCA 

CAAGGTTGCA 

ACGCAGGCTC 

ATGATGAGCA 

GTCCTTGGGG 

GGACTGGGCG 

TAAATATCGT 

TAATACTTTA 

TGGTGTGCTG 

CCCATCCCAT 

CCACCTGTGA 

CCAGCTTCGT 

GTATATGTGC 

ACTGTGAATA 

GGTATATACC 

CACTGTCTTC 

AGGATGTGGA 

CGTCAGGAAG 

AACTCTAGCT 

GAAATAAGTT 

GTCCTCGAGC 

TGGGCTTGGG 

CATGCTGAGG 

CCCAGCTTTC 

TCGTTCGTCT 

AATTGCTGTA 

TTGGAGAATG 

GTAAAAAGTG 

ATTTCTTGTT

TGTGGGGACC

TGTCTCTGCC

AGGCCTGGGG

ACCCAGGTTA

GCGGCTCCTG

CCTCCCCAGG 

CGGTCACGTT 

G AA C C r\ C G G 

GGACv-7GGCC 

TTGTTTTAAA

GGCCGCCAGG 

AGTCCACCCT 

AGCCCGGAGC 

GCAACTGAGG

GCAAGTTCCT

TCTGTCTCTG



TGACGCTGGA 
TCATTCCTGT 
TGGAAAGTGC 
CTGGGAGAGC 
CCACATCACG 
TGTGCCTGGG 
GTCAGGTGGA 
AGCTTTTATT 
ATATTTATTA 
CAGCAGAGTG 
CGCTGACAAA 
GATCAAGTCA 
CCTGACAGAC 
ATCGTGGTCC 
AGAGTTTCCC 
TAGAGTTGGT 
CAAGGAAAGC 
CTTGAGTGCA 
CAGCCTCCCG 
AGAGGGGTTT 
CTCCCAAAGT 
CCTATAGCGC 
GTGGCGTCCT 
GTTCCACGGG 
TCTGCTTCTC 
GGGTCATGAG 
CGGGTCAAGT 
ATTCAGAGCA 
GGGAGCAGGG 
ACGTGTGGTT 
ATGCCCTGCC 
GGCTCCATCC 
TTCGGCGGGA 
TTCAGTGTTA 
GCCCCTTCTT 
TGTCCAGCGG 
TGTGAATTCA 
AGATGGGGCT 
CCTCTTCAGC 
GCCAACCTAA 
AGTTCTAGGG 
CACCCATTAA 
GACAGGCCCT 
GTGAGAACAT 
CCATGTCCCT 
CACATTTTCT 
GTGCCGCAAT 
CAGTAATGGG 
CACAATGGTT 
CAGCAGTTAT 
CCTGCAGGCC 
CCAATTATAG 
TATGTAACAG 
TGGCGGCACA 
CCTGAGGGTC 
ACCACAGCTG 
TTACCGTCTT 
TCAGCTGGCA 
GCAGTTAACT
TTACTTTATT
TAAAGTTAAC
GGTGACACCT
TCCACAGCCT
AAGTCCTCTC 
GCGCAGGGGC 
CACACGTGGT 
GGGCCCCAGT 
GTGCACCTGA 
CCTGCGTGGG 
GATGGCTAGG 
TCAGCACAGG
GTGCGATTTG
GGTGGTTTCA
CCAGGTCCAC
ACAGCAGGCT 
GCTCAGGAGT 
GAGGGTGCTG 
ATGA37CGGC 



CACCCGGCTC 
CCCTGTCGTG 
GGGGTTGACC 
TGCCGTCACA 
TCCTCTGGAT 
GAGAGTGGTA 
ACGTGGAGGC 
TAAAAATATA 
AAGTATAATT 
AATTTTGGCC 
GTCACCTCCC 
CGTACCGTCC 
AGGAGGTGAC 
AGTTTGGCCT 
ATCCCATGTG 
GCGTGTGCTT 
CTCTTTTCTT 
GTGGCGCGAT 
AGCAGCTGAG 
TTGCCATGTT 
GCTGGGATTA 
TTCCCGAAAA 
GTGGGCTCTG 
GCTATTCTGC 
GTTGGTCATG 
TCTTTCACCG 
GTCTGGTTCT 
AGGATGTGGT 
ATTGTTTGTT 
TGTGTGTATG 
CGTCTCTCAC 
TCTCCACGCT 
CGGGTGAGGC 
ATATTCCTGG 
GGTATGAAGC 
CCATGTCCAG 
ACACCGAGGA 
GGTGCAGCCT 
CCATTGCCCA 
TGTGGTTCAA 
TACATGTGCA 
CTCATCATTT 
GGTGTGTGAT 
GTGGTGTTTG 
ACAAAGGACA 
TAATCCAGTC 
AAACATACGT 
ATGGCTGGGT 
GAACTAGTTT 
TTTTTTATGA 
ACACAGCCAT 
CATGTACAGT 
AAACAAAAAT 
CTGGTCAGCC 
ACACAGTGCA 
CCATGCTGGT 
CAGTTATTTT 
CAGAATTGCA 
GTAGAGAGCT 
TATGGCTGTG 
CTTGCTGTGT 
CACCTCACCC 
GTGGGCTTTG
TCTCTGCCGG 
ACCTTCGGGA 
GAGTGCAGGC 
GAGACCCCCA 
GCCTGCGGAG 
GTTGTTTGGG
AGTGGGTTTC
GGATTGTCCA
ACGAGGGACG
GGTGCTTTGC
CCTCCAGGGC
GTGCACATTT
CCTGAGGCTG 
GCCAGGGAGG 
AGCCATGTAA



TCACACGCTT 
TGACCCCCGC 
GTGTAGTTTG 
CAGCCACTGG 
TTTAAGTAAA 
GCACGGAGGA 
CTCTCTCTGG 
ACTATTAATT 
AGAAATATTA 
GAGGGACACG 
CAGAGAAGCC 
ACGTGTGGCA 
TGTGTCTGTC 
CTGAATAAAA 
CTCACAGGGG 
CTGTGCAAAA 
TTCTTTCTTT 
CTCAACTCAC 
ATTACAGGCA 
GGCCAGGCTG 
CAGGTGTGAG 
TAACAGGTCT 
GGGATGGCTG 
TCTCACTGTT 
CTGAAACTAG 
TGGACAAATT 
GTGAATAAAC 
CACACCTGTG 
CAGAGGTCTC 
AGGTTCTGTG 
CTGTGTCTTC 
GCTCTGCAGC 
CTCCTCTTCC 
TGCTCTGGAG 
CGCACGGGAG 
AGGCCTCAGG 
AGCACACCAG 
GAGGCCCCAC 
TCCCACTTGC 
CTCAGCTGGC 
CGACGTGCAG 
ACATTAGGTA 
GTTCCCCACC 
GTTTTCTTTC 
TGAACTCATC 
TATCATCGAT 
GTGCATGTGT 
CAAATGGTAT 
ACACTCCCAC 
AAATAGTATC 
TTCTCTCGAA 
GGATCAAGGT 
TTCTTGTACA 
CTCTGGGACA 
CCATGCCCAG 
AAAGGGCACC 
TCCCTAAGAG 
CAAGCTGATG 
CGTCTGTTGG 
TAAATTGTTT 
ATTTTCCCTT 
ACGCGAAAAC 
CAGTTGAGCC 
TGCTGGATCC 
GGGAGTGGGT 
GGTGACCTGG 
GGAGCTGTGC 
AGCAGGAGCT 
ATCGGTGGGA 
AGAGTTGATT 
ATGTGGTCCC 
AGAAACCTTG
TGGGCTGTGT
CGCCCTGGGC 
AAATCCACTA 
CTGAGGGGAC 
TGGCTCAGAG 
CAGGAAGGGG 



5320 

5390 

5460 

5530 

5600 

5670 

5740 

5810 

5880 

5950 

6020 

6090 

6160 

6230 

6300 

6370 

6440 

6510 

6580 

6650 

6720 

6790 

6860 

6930 

7000 

7070 

7140 

7210 

7280 

7350 

7420 

7490 

7560 

7630 

7700 

7770 

7840 

7910 

7980 

8050 

8120 

8190 

8260 

8330 

8400 

8470 

8540 

8610 

8680 

8750 

8820 

8890 

8960 

9030 

9100 

9170 

9240 

9310 

9380

9450 

9520 

9590 

9660

9730 

9800 

9870 

9940 

10010 

10080 

10150 

10220 

10290 

10360 

10430 

10500 

10570 

10640 

10710 
TGGCCACAGG GAGCTGGGAA TGCACCAGGG GAGCTGCGCA

GAAGGGCAGG GGGACGCCCG GGGCCACAGC AGAGGCCGCA 

GAGGCTACCG GGCACAGGGG GGCTCCCTGA GCTGGGTGAG 

TTGACGTGAA GCTGACGACT GGTGTTGCCC AGCTCACAGC 

CAGAACCCTC CCCTTTGTCT AAAGCACAGC AGATGCCTTC 

TTGAGAAACG TCTTAAAAGA AGGTGGGATG GTGGCAATTT 

CAGATGAGTC TATAACGGGA TTGTGGTGTT GCCATGGGGA 

GGGGCTGCAC CTCCCATCTG AGTCCTGGCT GTCCCGGGTC 

GTCCTGCCCG GGAGACAGGG AAAGCACCCC GAAGTCTGGA 

CTGCCAGGCC CAGCACCCTG CTCCAAATCA CCACTTCTCT 

CAGGTTACCT CCTGGGTGAC GGCCCCGCAT CCTGGGGCTG 

CGAGGTGTCC CTGAGTATGG CTGCGTGGTG AACTTGCGGA 

AGGCCCTGGG TGGCACGGCT TTTGTTCAGA TGCCGGCCCA 

GGATACCCGG ACCCTGGAGG TGCAGAGCGA CTACTCCAGG 

GCCCGGCTGG GGCAGGTGCT GCTGCAGGGC CGTTGCGTCC 

CCAATCCCAA AGGGTCAGAG GCCACAGGGT GCCCCTCGTC 

CTGTGGGAGT GAGGGTGCTC ACAACGGGAG CAGTTTTCTG 

AGACCTGGGT GCACTGAGGT GTCTTCAGAA AGCAGTCTGG 

GGCGTGAGTC TCTCAAACCC GAACACAGGG GCCCTGCTGG 

GGCCCTGCTG GGCGTGAGTC TCTCCGAACC CAGAGACTTC 

GTGAGCCCCA CACTCCAAGG CTCATCCACA GTCTACAGGA 

TCAGGGGACA GGGCCATGGT GTGGGGGGGG TCTCTACAAA 

GAGCTCAAGG CCCCGTCTCA GGCTCAGACA CAAATGAATT 

TGTTTCTTTT ATGAATAAAA AGTATCAACA TTCCAGGCAG

ACTTTGGGAG GCCGAGGTGG GTGGATCACT TGAGGCCAGG 

AATTCCATTT CTACTTAAAA AATACAAAAA TTAGCCTGGC 

GCGGGAGGCT GAGGCAGGAG AATCATTTGA ACCCAGGAGG 

CTGCACTCCA GCCTGGGCAA CAGAGTGAGA CTTCATCTTA 

AACCATAGTG GACAGGTGTT TTTTTATTCT GTCCTTCGAT 

AACTGGGGGT GCCTTCCTCT GAAAGGCACA CCTTCATGGG 

CCAGAGGTTT AAACTGGGGT CCTGTCGTTC TGAGTTAACA 

ATGCTCCCTG GGGTTTGCTT CATGGGGGAG CAGCAGGTGT 

GCAGACGCCC TCATGATGGG GGAGTGGCAG GTGCAGACAC 

TGCAGCTCCC TCCCCACAAG GATGCCGGTC TCCTGTGCTC 

TACCTGGTCC TGGCCTCCAC TGGCTTTGTC TGCATGATTT 

GCCTCTCCCA GGCACCTCTG CAGTGCTGGC CATACCAGTC 

CCCCATGAAA TGTATTTTTT AGGACAGGCA CCCCTGGTTC 

TTGAAGGACA AAGGACAGAC AAACAAATCA GGAAAATGGG 

GGCTAGTGCA GGATGGGTGG GCATCAGGTC ATCAGATGTG 

AAGGCCACTT GGTCAGAGTG TGTGCTTGCA GAGGTGGCTC 

CCATACTCAG GGTGAACTCA CATCCTCTGT GTCTGAAGTA 

AAGAAAACAG GCAAAATGAT TAAGAAAAGT GAAAAAGGAA 

ATTTTAGTCT CCCAAACCAC AGCTCAGATG GTAGAATGTG 

AAACGGAAGC CCTATCTCTC AGAAACGTGT GTTAATGTGG 

TGTGTGTAAT TTTTTTTTCT GAGAAAACTG ACTGGAAGCA 

AGCAGATTCT AGGTAGAAGA GGAGACACAT GCAAACAACA 

AGGGAAGGGA GGTGAACGTT CCCTGGTTTG GTGTTGGGGA 

GAGGCAACGG GCATTGCTTT CACTGCAGAG AAACTCAGCT 

CTGGAGCGTT TGTGCACGTG ATTTATTTAA GGCGCCCTGT 

TTCTCCTAAC CACCTGAGAG GTAGAGGAGG AAAGGCTCCA 

GCAAAGGGCA TGCATGATTG CAGCCTGGCC TCCTGCTCCG 

AAGTCAGACC CATAGGCTCA GGGTGAGCCG GAGCCCAAGG 

TGGACGTCTG ATGCACACTT GGGAAGGTCC TACCAGCAGC 

AGACCCATCC CTCAAAGAAA CGCACGTGAA ACTGATGGCG 

TTTTCTGGGC TTGCCAAGAG CCAGCATCAG GTTGAGGCAA 

TTTGCATGGA AGTCCTCACA ATGTCCTGTG TCTTCCCAGT 

CACGGGTCTT ATTTACCATT TCCAGTGTTC CAGGCAGGGG 

AAATACAGGG CTAAGGAGAT ATTATGCATC ACAAAACTTG 

TTGAAGAATG TTTAATGGCA CAAAACGTTT ATTTCAATGT 

ACACCCCAGG AGCCTGCCGT GAATGTCATG TGTGTTCATC 

GGTGGTGAGG CCCTGGAGGA CATCGGTGGG ATGCCTCCAT 

CGTGCACTCA CTGGAGCCCT GTTTAGCTGG TGCCACCTGG 

AGATTCCCCA CGCCCAACTC AGTGTTCTCC CACAAAAAAC 

GCCCGGGAGC CAGGGCTCCA CAGTTTATTA TGTGTTTTTG 

GATGATGAGT GCACAAACAC GGCCGTGCGA GGTTTGGATA 

AGTTTGGTCA TGCAGAGTCT GGATGGCATG TAGCATTTGG 

GGCTGCAGCG CATGCCCCAG GCAGGACAAG GAAGCGGGAG 

GCAGGAGGGG GCTGGGTGTG GGGCAGGCAC CTGTGTCTGA 

ACCTCCATCA GAGCCAGTCT CACCTTCAAC CGCGGCTTCA 

TTGGGGTCTT GCGGCTGAAG TGTCACAGCC TGTTTCTGGA 

AGAGTTCAGA GTTCAGGAGG TGTGTGCGCA AGTATGTGTG 

ATGGTGACTG GCTGCACGTA AGAGTGCACA TGTACGCATA 

TGTACATGAA GGCATGGCAG TGTGTGCACA GGTGTGCAAG 

CCTGACATGC ATGTGTGTTC GTGCACAGTC GTGTGGGCAT 

GTGTGAGTAG CATGTGTGCA CATAACATGT ATTGAGGGGT 

CCAGTGCCAC TCCTTACAGG ATGAGACGGG GTCCCAGGCC 

CTGAGGGCAT TGTCCCATCT GGGCATCCGC GTCCACTCCC 

TCTCCTGTGG GCATTTACAT CCACTCCACT CCCTCTCTCC 



GCTGGCCGAG GTCCCAGGGC CAGGCCACAG 10780 

GGAAGGGAAG GGGATGCCCA GGCCAGAGCA 10850 

CGAGGCTCAT GACTCGGCGA GGGAACCTCC 10920 

CCAGCCAGGT CCCGCGCCTG AGCAGGAACT 10990 

AGGGCATCTA GGAGAAAACA GGCAAAGTCG 11060 

CTTGTCCAGA TTTTAGTCTG CCCCGGACCA 11130 

CACATGAGAT GGACCATCAC AGAGGCCACT 11200 

CAGGCCAGGT TCTTGCATGC TCACCTACCT 11270

GCAGGGCTGG GTCCAGGCTC CTCAGAGCTC 11340 

GGGGTTTTCC AAAGCATTTA ACAAGGGTGT 11410 

ACATTGCCCC TCTGCCTTAG GACCCTGGTC 11480

AGACAGTGGT GAACTTCCCT GTAGAAGACG 11550 

CGGCCTATTC CCCTGGTGCG GCCTGCTGCT 11620 

TGAGCGCACC TGGCCGGAAG TGGAGCCTGT 11690 

ACCTCTGCTT CCGTGTGGGG CAGGCGACTG 11760

CCATCTGGGG CTGAGCAGAA ATGCATCTTT 11830

TGCTATTTTG GTAAAAGGAA ATGGTGCACC 11900 

ATCCGAACCC AAGACGCCCG GGCCCTGCTG 11970 

GCATGAGTCC CTCTGAACCC GAGACCCTGG 12040

AGGGCCCTTT TGGGCGTGAG TCTCTCCGCT 12110 

TGCCATGAGT TCATGATCAC GTGTGACCCA 12180 

ATTCTGGGGT CTTGTTTCCC CAGAGCCCGA 12250 

GAAGATGGAC ACAGATGCAG AAATCTGTGC 12320 

GGCAAGGTGG CTCACACCTA TAATCCCAGC 12390 

AGTTTGAGGC CAACCTAACC AACATAGTGA 12460

CTGGTGGCAC ACGCCTGTAG TCCCCGCTAT 12530

CAGAGGTTGC AGTGAGCCGA GATCACACCA 12600

AAAAAAAAAA AAAAAGTATC AGCATTCCAA 12670

AATATTTACT GGTGCTGTGC TAGAGGCCGG 12740

AAGAGAAATA AGTGGTGAAT GGTTGTTAAA 12810

GTCCAGATCT GGACTTTGCC TCTTTCCAGA 12880

GGACACCCTC GTGATGGGGG AGCAGCAGGT 12950

CCTTGTGCAT GGTGCCCAGC ATGTCCCTGT 13020

CCCACAGTCC CTGCTTCCCT CTCACAGCCT 13090 

CCACATTTCC TGGGCTCCCA GCACCTCTTC 13160 

AGCTGTGAAC TGTCCACTGC TTATTTTGCT 13230

CAGCCTCTGG CACAGCATCA GTGAATGTTA 13300 

TTCTCTCTAA ACACATTGCA AAGCCACAGA 13370 

GGTCCAATGC CAGAATATTC TGTGCTCCCA 13440 

TAAAAGCTCA GCAGTGGAGG CAGTGGTTCG 13510 

TACAGCAGAG GCTTGAAGGG CATCTGGGAG 13580 

AAGTGGTAAG ATGGGAATTT TCTTGTCCAG 13650

GTCAGAACTG ATGGACAGAA CAATAGAACA 13720

TATGTGGCAC AGCTGATGGA AAAGAGAGTG 13790

AATAAGTTGT GTCTTTACAG CATATACCAG 13860

CCAGCAACAG AAATAAAACA AAAGACTCAA 13930

AGGACACACA GGGAGGCGGA TGAAACCAGT 14000

TGCCTGAGCC ACAGTGAAAA TGGCCATTCC 14070

GAGGTCCTGC ACATTCATCC TCTCACTTTG 14140

GGGGAGCAGC CGCCCTTGGT CACCCAGCTG 14210

GGGCCCTTGC TCTGCCCGAG GACCCCACAC 14280

TCGTGTTGGG GATGGCTGTG AAAGAAGAAA 14350

GTCAAAGAAA TGCATGTGAA ACTGACAGCG 14420

AGACCTGTCC CCATCCCTCA TGCTGGCTCC 14490

GCTGGAAAGA CTTTTCTGGA AAGCAGCTTG 14560

AATTCCACTT CTGAAGTGAC CAGACATTAT 14630

GACTTGCCAC AGCAAGTCAC GAACCTGCCC 14700

CTCTGCCATT AAACATTTTT CAAAGAATTT 14770

AGCAGTGTTC AAAGCTGGAT GTAAAAGAAC 14840

TTTGGACATG GACATACATG GGCAGTGAGT 14910

CCTGCCCCTC TGGAGACACC ATGTGTGCCA 14980

CTCTTCCATC CCTGAGATTC AAACACAGTG 15050 

CTGAGTCACA CCTGTGTTCA CTCGAGGGAC 15120 

GCTGAGTTAT GTGCAGATCT CATCAGGGCA 15190 

CACTCAACAT CACTAGCCAG GTCCTGGTGG 15260

AGTCCATGGA GTGAGCACCC AGCCCCCTCG 15330

GAAGGCAGGA GGCTCTTTGG AGCAAGCTTT 15400

CATTCCCCCC TGTGTCTCAG CTATGCCCGG 15470

AGGCTGGGAG GAACATGCGT CGCAAACTCT 15540

TTTGCAGGTG AGCAGGCTGA TGGTCAGCAC 15610

TGTGTGTGTG CGCGCGTGCC TGCAAGGCTG 15680

TACACGTGAG CACATACATG TGTGCATGTG 15750

GGCACAAGTG TGTGCACATG CGAATGCACA 15820

TCACGTGAGG TGCATGCGTG TGGGTGTGCA 15890

CCTCGTGTTC ACCCCGCTAG GTCCTCAGCA 15960

TTGGTGGGCT GAGGCTCTGA AGCTGCAGCC 16030

TCTCCTGTGG GCTTCTGTGT CCACTCCCCC 16100

TGTGGGCATC CGCGTCCACT CCCCCTCTCT 16170
GTGGGCATCT GCGTCCACCT CCCCTCTCTG TGGGCATTTG
CTTGGCCGAG CCTCGGGGGC AGGCAGATGA CACAGAGTCT 
GGTGAGGGCC AGGCCGGATT TCACTGGGAA GAGGGATAGT 
CATCTGAATG GATGATAAAG CAAAAAGTAA AAACTTAAAA 
TTTCTTGGCG ACTCTAGGTG AACAGCCTCC AGACGGTGTG 
GGCGTACAGG TGAGCCGCCA CCAAGGGGTG CAGGCCCAGC 
CTGACCCGGG GCTTCACCTT GGAACTCCTG GGTTTTAGGG 
TGCTGCCTGT GCACAGTTCT GTTCGCGTGG CTCTGTGCAA 
AGGAGCCGGT GTGGCCCCAG GTGTCCCCAC TGTGCCTGTG 
CCAGGGCAGC AGGGGCATGG GGTAAAGAGA TGTTTATGGG 
CTGAACAGTA GATGGGAGAT CAGATGCCCG GAGGATTTGG 
AGGTGAGGGT CGCTGGCCCC ACCCCCGGGA AGGTGCAGCA 
CACCTGTGCT CTGGGCATGG CTGTGCTCCT GGAACGTTCC 
CAAGAATCGA CAACTTTATC ACAGAGGGAA GGGCCAATCT 
AGTCAGGGCA GGTGGTGGCA CAAGCCTCGG GGCTGTACCA 
TCCACCTCAA CAGGCCTCCC GAGCCACTGG GAGCTGAATG 
GCTGAGAAGG AGTGTGAGCA TTTGTGTTAC CCAGGGCCGA 
TGAAATGAGG TCGTCGTCTA TCGTGGAAAC CCAGCAAGGG 
TACCATGAAA ATGGTTTTTA ACCCGAGTGC TTGCGCCTTC 
TGCATGTTAC CGCCTTTGCA CCAGCTCCAG AGGCTTGGGA 
GCTCAGACCG CCCTCCTCTC TGCCTTCTCT CTCTGCCTCA 
GTGCCTGGGC CCTCGTGCAA GCTGCTTGAC TCCTTTCCGG 
CACTGAGGAC TGGAGGTGTC TGACACTGTG GTTGACCCCA 
GGGCCATGAT GAGGTCAGAG GAGTTTTCCC AGGTGAAAAC 
GCCACCTGCT CCTCCCATAT TCAGCTCAGT CTTGTCCTCA 
GCTCCCGTAG AGGGCCTGGG CTCAGGGCAG GGCGGCTGAG 
GTCGCTTGAT TGGGTAGCCC TGAGGAGGCC GAGATGCGAT 
AGGCACGTGG AAGGCCCAGG AATCCCCTTC CCTCGAGGCA 
TTTCACGGCA GCCAGGCTGC AGTGGGCGAG GCTGTGGTGG 
CAAATCCGCT GGGGCTCGGC CTTCCTGGCC CGTGCTGGCC 
CCGACCTCTA GCAGGTGGCT ATTTCTCCCT TTGGAAGAGA 
TGGGTCAGGA GCGTGGCCGT GTGGCAACCC CGGGACCTTA 
GCCTGGCTTC CGTTGTTGCT AAATGGGGAA AAGACATCCC 
CGGGGTGCTG GCTTGACTGG TGTGATCTCA GGTCATTCCA 
TACATGGGGG GCTCAGGCAG TGGGTGAGAT GAGGTACACG 
CATGGGGGGC TCAGGCACTG GGTGAGATGA GGTACACGGG 
CGGGGGCTCT GATCACACGC ACATATGAGC ACATGTGCAC 
CACACCTGCC CCAAAGTCCC AGGAAGCTGA GAGGCCAAAG 
CACACCTGTA GTCCCAGCAC TTTGGGAGGC CGAGGCGAGA 
GCCTGAGCAA CATAGTAGAA CCCCATCTCT ATGAAAAATA 
GCGCCTGTAG TTCCAATACT TGGGAGGCTG AAGTGGGAGG 
GTGAGCTGAG ATTGCACCAC TGTACTGCAG CCTGGGTGAC 
GAAGACTGAC AAATGCAGTT TCTTGGAAAG AAACATTTAG 
TCGGTGTCTC GGTGTCAGTG AGATGAGATG ATGGGTCCTC 
ACCACAGGGG CGGGTGGCTC AGAAGGGATG CGCAGGACGT 
GAAGGGCAGG ATTCATGATA AGTACCTGCT GGTACACAAG 
CCTTCCCGGA ACAGGGGCTA ATCAGAAGCC AGCATGGGGG 
TCCACATGCG TGTTCATACA GATGGTGCAC AGAAACGCAG 
CTCGCACACA CAAGCACACA CACAGACATG CATGCATGCA 
AACCCATGCA TGTGCATTCA TGCACGCACA CAGGCACCGG 
CTGATTAGGA GGCCTTTCCT CTGACGCTGT CCGCCATCCT 
CCATTTCATC AGCAAGTTTG GAAGAACCCC ACATTTTTCC 
GCTACTCCAT CCTGAAAGCC AAGAACGCAG GTATGTGCAG 
CTGCTGGTGT TAGTGTGTCA GGAGACTGAG TGAATCTGGG 
GGAAGTGGTT TAACCCAACC ACTGTC AGGC TCGTCTGCCC 
TGGAAGGGAC AGGAGCTGTC TGGGAGCTGC CATCCTTCCC 
GCCTGGTCTC TCCTGTTTGC CCCATGGTGG GATTTGGGGG 
GATTGGGCTG TCTCCCGTCC ATGGCACTTA GGGCCCTTGT 
CCAGGCCCAG GCTACCCCAC CCCTCTCAGG AGCAGAGGCC 
CCTCTGCTTC CCAGTCACCG TCCTCTGCCC CTGGACACTT 
CTGAAATTCA AGCCATGTCG AACCTGCGGT CCTGAGCTTA 
GTGGAAATTT CACCTGGAGA AGCCGAAGAA AACATTTCTG 
AGCCAGAGAT GGAGCCACCC CGCAGACCGT CGGGTGTGGG 
TGGGCTGGGC CTGTGACTCC TCAGCCTCTG TTTTCCCCCA 
GGCCCTCTGC CCTCCGAGGC CGTGCAGTGG CTGTGCCACC 
G7GTCACCTA CGTGCCACTC CTGGGGTCAC TCAGGACAGG 
ACCTGCCCAG GGGTCATCCT TGAACGCCCT GTGTGGGGCG 
GCCCCCGGGC CTGACCCTGG GGGCCTGGAG CCACGCTGGC 
AGGCCACGGA GCCTGGCAGG GTCCCCAACT TCTTGAACCC 
CACGCTTGGG AGCCTTCTGA CCCCTGACCT GTGTCCTCTC 
TCCTGGGGTC CTGAGCAAGT TCTCTCCCCG CCCCGCCGCT 
CC 7GGTGGAG GGGTGTCTGT CCCTTCACTG AGGTTCCCAC 
TGCCCGGCCA CCCACACGTC CTAGGAGGGT TGGAGGATGC 
A77TTGGCCC CGCAGCCCAG ACGCAGCTGA GTCGGAAGCT 
CGGAGCCAAC CCGGCACTGC CCTC AGACTT CAAGACCATC 
GCCSAGAGCA GACACCAGCA GCCCTGTCAC GCCGGGCTCT 
CCAGGCCCGC ACCGCTGGGA GTCTGAGGCC TGAGTGAGTG 
GC7GAGTGTC CGGCTGAGGC CTGAGCGAGT GTCCAGCCAA 



CGTCCACTCC CTCTCCTGGT TCCTTCCTGT 16240 
TGACTCGCCC AGGGTGGTTC GCAGCTGCCG 16310 
TTCTTGTCAA AATGTTCCTC TTTCTTGTTC 16380 
TCCCAGAGAG GTTTCTACCG TTTCTCACTC 164 50 
CACCAACATC TACAAGATCC TCCTGCTGCA 16520 
CTCCAGGGAC CCTCCGCGCT CTGCTCACCT 165 90 
GCAAGGAATG TCTTACGTTT TCAGTGGTGC 16660 
AGCACCTGTT CTCCATCTCT GGGTAGTGGT 16730 
CACTGGCCGT GGGACGTCAT GGAGGCCATC 16800 
GAGTCTTAGC AGAGGAGGCT GGGAAGGTGT 16870 
GGTCTCAGCA AAGAGGGCCG AGGTGGGTGC 1694 0 
GAGCTGTGGC TCCCCACACA GCCCGGCCAG 17010 
CTGTCCTGGC TGGTCAGGGG GTGCCCCTGC 17 080 
GTGGAGGCCA CAGGGCCAGC TTCTGCCTGG 17150 
AAGGGCAGTC GGGCACCACA GGCCCGGGCC 17220 
CCAGGAGGCC GAAGCCCTCG CCCCATGAGG 172 90 
GGCTGCGCGA ATTACCGTGC ACACTTGATG 17 3 60 
CTCACGGGAG AGTTTTCCAT TACAAGGTCG 17 4 30 
ATGCTCTGGC AGGGAGGGCA GAGCCACAGC 17 500 
CCAGGCTGTC TCAGTTCCAG GGTGCGTCCG 17 57 0 
AATCTTCCCT CGTTTGCATC TCCCTGACGC 17 64 0 
AAACCCTTGG GGTGTGCTGG ATACAGGTGC 17710 
GGGTCCAGCT GGCGTGCTTG GGGCCTCCTT 17780 
TCCTGGGAAA CTCCCAGGGC CATGTGACCT 17 850 
TTTCCCCACC AGGGTCTCTA GCTCCGAGGA 17 920 
TTTCCCCACC CATGTGGGGA CCCTTGGGTA 17 990 
GGGCCACGGG CCGTTTCCAA ACACAGAGTC 180 60 
GGAGTGGGAG AACGGAGAGC TGGGCCCCGA 18130 
TCCACGTGGC GCTGGGGGCG GGGTCTGATT 18200 
GCGCCTCCAC ACGGGCTTGG GGTGGACGCC 18270 
GCCCCTCACC CATGCTAGGT GTTTCCCTCC 18340 
GGCTTATTTA TTTGTTTAAA AACATTCTGG 18 410 
ACCTCAGCAG AGTTACTGAG AGGCTGAAAC 184 80 
GAAGTGGCTC AGGAAGTCAG TGAGACCAGG 18 5 50 
GGGGGCTCAG GCAGTGGGTG AGGCCAGGTA 18 620 
GGGCTCAGGC AGAGGGTCAG ACCAGGTACA 18 690 
ATGTGCTGTT TCATGGTAGC CAGGTCTGTG .18 7 60 
ATGGAGGCTG ACAGGGCTGG CGCGGTGGCT 18 8 30 
GGATCCCTTG AGCCCAGGAG TTTAAGACCA 18 900 
AAAACAAAAA TTAGCTGAAC ATGGTGGTGT 18 970 
ATCACTTGAG CCCAGGAGGT GGAAGCTGCA 19040 
AGAGTGAGAG CCCATCTCAA CAACAACAAA 19110 
TAGGAACTTA ACCTACACAC AGAAGCCAAG 19180 
ACACCATCAC CCCAGACCCA GGGTTTATGC 19250 
TGATATACGA TGACATCAAG GTTGTCTGAC 19320 
GAACAATGGA TAAACTGGAA ACCTTAGAGG 19390 
GCTGGCATCC AGGATGGAGC TGCTTCAGCC 194 60 
TGTACCTGTG CACACACAGA CACGCAGCTA 19530 
TCCGTGTGTG TGCACCTGTG CCCATGAGGA 19600 
TGGGCCCATG CCCACACCCA CGAGCACCGT 19670 
CTCAGGTTTC ACGCATGTGT GCTGCAGCTC 19740 
TGCGCGTCAT CTCTGACACG GCCTCCCTCT 19810 
GTGCCTGGCC TCAGTGGCAG CAGTGCCTGC 19880 
CTTAGGAAGT TCTTACCCCT TTTCGCATCA 19950 
GCCCTCTCGT GGGGTGAGCA GAGCACCTGA 20020 
ACCTTGCTCT GCCTGGGGAA GCGCTGGGGG 20090 
GCCTGGCCTC TCCTGTTTGC CCTGTGGTGG 20160 
GCAAACCCAG GCCAAGGGCT TAGGAGGAGG 202 30 
GCGTATCACC ACGACAGAGC CCCGCGCCGT 20300 
TGTCCAGCAT CAGGGAGGTT TCTGATCCGT 2 0 370 
ACAGCTTCTA CTTTCTGTTC TTTCTGTGTT 204 4 0 
TCGTGACTCC TGCGGTGCTT GGGTCGGGAC 20510 
CAGCTTTCCG GTGTCTCCTG GGAGGGGAGC 20580 
GGGATGTCGC TGGGGGCCAA GGGCGCCGCC 20 650 
AAGCATTCCT GCTCAAGCTG ACTCGACACC 2 072 0 
CAAGTGTGGG TGGAGGCCAG TGCGGGCCCC 2 07 90 
AGCAGCCTCA GATGCTGCTG AAGTGCAGAC 208 60 
AGCCCTATGT GATTAAACGC TGGTGTCCCC 20930 
CTGCTTCCCA TCTCAGGGGC GATGGCTCCC 21000 
ACAGCCTCTT CCCTGGCTGC TGCCCTGAGC 21070 
CCAGCGTCAC TGGGCTGCCT GTCTGCTCGC 21140 
CAGCC AGGGC CACGAGGTGC AGGCCCTGCC 21210 
CACCTCTGGC CTCTTCTGGA ACGGAGTCTG 21280 
CCCGGGGACG ACGCTGACTG CCCTGGAGGC 21350 
CTGGAC TGA T GGCCACCCGC CCACAGCCAG 21420 
ACG7CCCAGG GAGGGAGGGG CGGCCCACAC 21490 
TTTGGCCGAG GCCTGCATGT CCGGCTGAAG 21560 
GGGCTGAGTG TCCAGCACAC CTGCCGTCTT 21630 
CACTTCCCCA CAGGCTGGCG CTCGGCTCCA CCCCAGGGCC AGCTTTTCCT CACCAGGAGC CCGGCTTCCA 21700
CTCCCCACAT AGGAATAGTC CATCCCCAGA TTCGCCATTG TTCACCCCTC GCCCTGCCCT CCTTTGCCTT 21770
CCACCCCCAC CATCCAGGTG GAGACCCTGA GAAGGACCCT GGGAGCTCTG GGAATTTGGA GTGACCAAAG 21840
GTGTGCCCTG TACACAGGCG AGGACCCTGC ACCTGGATGG GGGTCCCTGT GGGTCAAATT GGGGGGAGGT 21910
GCTGTGGGAG TAAAATACTG AATATATGAG TTTTTCAGTT TTGAAAAAAA TCTCATGTTT GAATCCTAAT 21980
GTGCACTGCA TAGACACCAC TGTATGCAAT TACAGAAGCC TGTGAGTGAA CGGGGTGGTG GTCAGTGCGG 22050
GCCCATGGCC TGGCTGTGCA TTTACGGAAG TCTATGAGTG AATGGGGTTG TGGTCAGTGC GGGCCCATGG 22120
CCTGGCTGGG CCTGGGAGGT TTCTGATGCT GTGAGGCAGG AGGGGAAGGA GGGTAGGGGA TAGACAGTGG 22190
GAGCCCCCAC CCTGGAAGAC ATAACAGTAA GTCCAGGCCC GAAGGGCAGC AGGGATGCTG GGGGCCCAGC 22260
TTGGGCGGCG GGGATGATGG AGGGCCTGGC CAGGGTGGCA GGGATGATGG GGGCCCCAGC TGGGGTGGCA 22330
GGGGTGATGG GGGGGGCTGG TCTGGGTGGC GGGGAAGATG GGGAAGCCTG GCTGGGCCCC CTCCTCCCCT 22400
GCCTCCCACC TGCAGCCGTG GATCCGGATG TGCTTCCCTG GTGCACATCC TCTGGGCCAT CAGCTTTCAT 22470
GGAGGTGGGG GGCAGGGGCA TGACACCATC CTGTATAAAA TCCAGGATTC CTCCTCCTGA ACGCCCCAAC 22540
TCAGGTTGAA AGTCACATTC CGCCTCTGGC CATTCTCTTA AGAGTAGACC AGGATTCTGA TCTCTGAAGG 22610
GTGGGTAGGG TGGGGCAGTG GAGGGTGTGG ACACAGGAGG CTTCAGGGTG GGGCTGGTGA TGCTCTCTCA 22680
TCCTCTTATC ATCTCCCAGT CTCATCTCTC ATCCTCTTAT CATCTCCCAG TCTCATCTGT CTTCCTCTTA 22750
TCTCCCAGTC TCATCTGTCA TCCTCTTACC ATCTCCCAGT CTCATCTCTT ATCCTCTTAT CTCCTAGTCT 22820
CATCCAGACT TACCTCCCAG GGCGGGTGCC AGGCTCGCAG TGGAGCTGGA CATACGTCCT TCCTCAGGCA 22890
GAAGGAACTG GAAGGATTGC AGAGAACAGG AGGGGCGGCT CAGAGGGACG CAGTCTTGGG GTGAAGAAAC 22960
AGCCCCTCCT CAGAAGTTGG CTTGGGCCAC ACGAAACCGA GGGCCCTGCG TGAGTGGCTC CAGAGCCTTC 23030
CAGCAGGTCC CTGGTGGGGC CTTATGGTAT GGCCGGGTCC TACTGAGTGC ACCTTGGACA GGGCTTCTGG 23100
TTTGAGTGCA GCCCGGACGT GCCTGGTGTC GGGGTGGGGG CTTATGGCCA CTGGATATGG CGTCATTTAT 23170
TGCTGCTGCT TCAGAGAATG TCTGAGTGAC CGAGCCTAAT GTGTATGGTG GGCCCAAGTC CACAGACTGT 23240
GTCGTAAATG CACTCTGGTG CCTGGAGCCC CCGTATAGGA GCTGTGAGGA AGGAGGGGCT CTTGGCAGCC 23310
GGCCTGGGGG CGCCTTTGCC CTGCAAACTG GAAGGGAGCG GCCCCGGGCG CCGTGGGCGG ACGACCTCAA 23380
GTGAGAGGTT GGACAGAACA GGGCGGGGAC TTCCCAGGAG CAGAGGCCGC TGCTCAGGCA CACCTGGGTT 23450
TGAATCACAG ACCAACAGGT CAGGCCATTG TTCAGCTATC CATCTTCTAC AAAGCTCCAG ATTCCTGTTT 23520
CTCCGGGTGT TTTTTGTTGA AATTTTACTC AGGATTACTT ATATTTTTTG CTAAAGTATT AGACCCTTAA 23590
AAAAGGTATT TGCTTTGATA TGGCTTAACT CACTAAGCAC CTACTTTATT TGTCTGTTTT TATTTATTAT 23660
TATTATTATT ATTAGAGATG GTGTCTACTC TGTCACCCAG GTTGTTAGTG CAGTGGCACA GTCATGGCTC 23730
GCTGTAGCCG CAAACCCCCA GGCTCAAGTG ATCCTCCGGC CTCAGCTTCC CAGAGTGCTG GGATTACAGG 23800
TGTGAGCCAC TGCCCTTGCC TGGCACTTTT AAAAACCACT ATGTAAGGTC AGGTCCAGTG GCTTCCACAC 23870
CTGTCATCCC AGTAGTTTGG GAAGCCGAGG CAGAAGGATT GTCTGAGGCC AGGAGTTTGA GACCAGCATG 23940
GGTAACATAG GGAGACCCCA TCTCTACAAA AAATGCAAAA AGTTATCCGG GCGTGGGGTC CAGCATCTGT 24010
AGTCCCAGCT GCTCGGGAGG CTGAGTGGGA GGATCGCTTG AGCCCGGGAG GTCATGGCTG CAGTGAGCTG 24080
TGATTGTACC ATCGCACTCC AGCCTGGGCA ACAGAGTGAG ACCCTGTCTC AAAAAAAAAA AAAAAAAAAG 24150
AAGGAGAAGG AGAAGAGAAG AAGAAGGAAG AAGGAAAGAG AAGAAGAAGG AAGAAGGAAG AAAGAAGGAG 24220
AAGGAGGCCT GCTAGGTGCT AGGTAGACTG TCAAATCTCA GAGCAAAATG AAAATAACAA AGTTTTAAAG 24290
GGAAAGAAAA ACCCCAGCTC TTTGGACTTC CTTAGGCCTG AACTTCATCT CAAGCAGCTT CCTTCCACAG 24360
ACAAGCGTGT ATGGAGCGAG TGAGTTCAAA GCAGAAAGGG AGGAGAAGCA GGCAAGGGTG GAGGCTGTGG 24430
GTGACACCAG CCAGGACCCC TGAAAGGGAG TGGTTGTTTT CCTGCCTCAG CCCCACGCTC CTGCCGGTCC 24500
TGCACCTGCT GTAACCGTCG ATGTTGGTGC CAGGTGCCCA CCTGGGAAGG ATGCTGTGCA GGGGGCTTGC 24570
CAAACTTTGG TGGGTTTCAG AAGCCCCAGG CACTTGTGGC AGGCACAATT ACAGCCCCTC CCCAAAGATG 24640
CCCACGTCCT TCTCCTGGAA CCTGTGAATG TGTCACCCGC AAGGCAGAGG CTGGTGAAGG CTGCAGGTGG 24710
AATCACGGCT GCCAGTCAGC CGATCTTAAG GTCATCCTGG ATTATCTGGT GGGCCTGATA TGGCCACAAG 24780
GGTCCCTAGA AGTGAGAGAG GGAGGCAGGG GAGAGTCAGA GAGGGGACGT GAGAAGGACC ACTGGCCACT 24850
GCTGGCTTTG AGATGGAGGA GGGGGTCCCC AGCCAAGGAA TGGGGGCAGC CGCTCCATGC TGGAAAAGCA 24920
AGCAATCCTC CCCGGTCCTG AGGGCACACG GCCCTGCCCA CGCCTCGATT TCAGGCCAGT GGGACCTGTT 24990
TCAGCTTTCC GGCCTCCAGA GCTGTAAGAT GATGCGTTTG TGTTCAGCCA CTAAGCTGCA GTGATTCGTC 25060
ACAGCAGCAA ATGGAATAGC AGTACAGGGA AATGAATACA GGGACAGTTC TCAGAGTGAC TCTCAGCCCA 25130
CCCCTGGG 25138

Comparison of the above-described genomic hTC sequence and the sequence of the
hTC cDNA (Fig. 6; corresponding to SEQ ID NO 2) made it possible to elucidate the 
exon-intron structure of the hTC gene. The genomic organization of the hTC gene is 
illustrated diagrammatically in Fig. 7. The coding region of the hTC gene is 
composed of 16 exons which vary in size between 62 bp and 1354 bp (see Table 1). 
Exon 1 contains the translation start codon ATG. The translation stop codon TGA 
and the 3'-untranslated region lie on exon 16 (Fig. 8). No possible polyadenylation 
signal (AATAAA) was found either in exon 16 or in the 3195 bp of the following 
3'-flanking region. The exon-intron transitions were determined on the basis of the
consensus sequence 

                5'-Exon             Intron                                            3'-Exon

Pre-mRNA        A/C  A    G    |    G    T    A/G  A    ...  N    C    A    G    |    G

Frequency (%)   70   60   80        100  100  95   70             80   100  100       60

and listed in Table 1. With the exception of the 5' splice site between exon 15 and 
intron 15, all the exon-intron transitions are in accord with the published (Shapiro 
and Senapathy, 1987) splice consensus sequence. The sizes of the introns are 
between 104 bp and 8616 bp. Since only part of intron 6 was isolated, it is not 
possible to determine the precise length of the hTC gene. Based on the partial sequence
of approximately 4660 bp which was obtained from intron 6, the minimum size of the hTERT
gene is 37 kb. 
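
As an illustration of the two invariant positions of this consensus (GT at the 5' end and AG at the 3' end of the intron, each given with a frequency of 100%), the following Python sketch tests the intron 1 sequence (SEQ ID NO 4) that is reproduced further below in this document. The function name `has_consensus_ends` is hypothetical, and only the two invariant dinucleotides are tested; the less conserved neighbouring positions are deliberately ignored.

```python
def has_consensus_ends(intron: str) -> bool:
    """Return True if the intron begins with GT and ends with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Intron 1 of the hTC gene (SEQ ID NO 4), copied from the listing in this document.
INTRON_1 = (
    "GTGGGCCTCCCCGGGGTCGGCGTCCGGCTGGGGTTGAGGGCGGCCGGGGGGAACCAGCGACATGCGGAGAGCAGCGCAGG"
    "CGACTCAGGGCGCTTCCCCCGCAG"
)

if __name__ == "__main__":
    # Report the length in bp and whether the invariant GT...AG ends are present.
    print(len(INTRON_1), has_consensus_ends(INTRON_1))
```

A sequence that begins with GC instead of GT, as would be expected for a non-consensus 5' splice site, is rejected by this check.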

Introns 1-5 and the 5' region of intron 6 are contained in contig 1:
Intron 1: bp 11493-11596 (SEQ ID NO 4);
Intron 2: bp 12951-21566 (SEQ ID NO 5);
Intron 3: bp 21763-23851 (SEQ ID NO 6);
Intron 4: bp 24033-24719 (SEQ ID NO 7);
Intron 5: bp 24900-25393 (SEQ ID NO 8);
5' region of intron 6: bp 25550-26414 (SEQ ID NO 9).

The 3' region of intron 6 and introns 7-15 are located in contig 2 at the following
positions:

3' region of intron 6: bp 1-3782 (SEQ ID NO 10);
Intron 7: bp 3879-4858 (SEQ ID NO 11);
Intron 8: bp 4945-7429 (SEQ ID NO 12);
Intron 9: bp 7544-9527 (SEQ ID NO 13);
Intron 10: bp 9600-11470 (SEQ ID NO 14);
Intron 11: bp 11660-15460 (SEQ ID NO 15);
Intron 12: bp 15588-16467 (SEQ ID NO 16);
Intron 13: bp 16530-19715 (SEQ ID NO 17);
Intron 14: bp 19841-20621 (SEQ ID NO 18);
Intron 15: bp 20760-21295 (SEQ ID NO 19).

The 3'-untranscribed region is also located in contig 2 at position 21960-25138 (SEQ 
ID NO 20). 
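
The intron sizes quoted earlier (between 104 bp and 8616 bp) can be cross-checked against the positions listed above. The following Python sketch is only an illustration: the names `CONTIG_1`, `CONTIG_2` and `intron_lengths` are hypothetical, positions are assumed to be inclusive at both ends, and the incompletely sequenced intron 6 is left out.

```python
# (start, end) positions in bp of the completely sequenced introns,
# taken from the two lists above; both ends are assumed to be inclusive.
CONTIG_1 = {
    1: (11493, 11596),
    2: (12951, 21566),
    3: (21763, 23851),
    4: (24033, 24719),
    5: (24900, 25393),
}
CONTIG_2 = {
    7: (3879, 4858),
    8: (4945, 7429),
    9: (7544, 9527),
    10: (9600, 11470),
    11: (11660, 15460),
    12: (15588, 16467),
    13: (16530, 19715),
    14: (19841, 20621),
    15: (20760, 21295),
}


def intron_lengths() -> dict:
    """Map intron number to its length in bp (end - start + 1)."""
    lengths = {}
    for contig in (CONTIG_1, CONTIG_2):
        for number, (start, end) in contig.items():
            lengths[number] = end - start + 1
    return lengths


if __name__ == "__main__":
    lengths = intron_lengths()
    for number in sorted(lengths):
        print(f"Intron {number}: {lengths[number]} bp")
    print("smallest:", min(lengths.values()), "largest:", max(lengths.values()))
```

With these positions the smallest intron is intron 1 and the largest is intron 2, in agreement with the size range stated in the text.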



The individual sequences of the abovementioned introns are as follows: 




Intron 1 (SEQ ID NO 4) 

GTGGGCCTCCCCGGGGTCGGCGTCCGGCTGGGGTTGAGGGCGGCCGGGGGGAACCAGCGACATGCGGAGAGCAGCGCAGG 
CGACTCAGGGCGCTTCCCCCGCAG 

Intron 2 (SEQ ID NO 5) 

gtgaggaggtggtggccgtcgagggcccaggccccagagctgaatgcagtaggggctcagaaaagggggcaggcagagcc 
ctggtcctcctgtctccatcgtcacgtgggcacacgtggcttttcgctcaggacgtcgagtggacacggtgatctctgcc 
tctgctctccctcctgtccagtttgcataaacttacgaggttcaccttcacgttttgatggacacgcggtttccaggcgc 
cgaggccagagcagtgaacagaggaggctgggcgcggcagtggagccgggttgccggcaatggggagaagtgtctggaag 
cacagacgctctggcgagggtgcctgcaggttacctataatcctcttcgcaatttcaagggtgggaatgagaggtgggga 
cgagaaccccctcttcctgggggtgggaggtaagggttttgcaggtgcacgtggtcagccaatatgcaggtttgtgttta 
agattt.aattgtgtgttgacggccaggtgcl-;gtggctcacgccggtaatcccagcactttgggaagctgaggcaggtgga 
tcacctgaggtcaggagtttgagaccagcctgaccaacatggtgaaaccctatctgtactaaaaatacaaaaattagctg 
ggcatggtggtgtgtgcctgtaatcccagctacttgggaggctgaggcaggagaatcacttgaacccaggaggcggaggc 
tgcagtgagctgagattgtgccattgtactccagcctgggcgacaagagtgaaactctgtctttaaaaaaaaaaagtgtt 
cgttgattgtgccaggacagggtagagggagggagataagactgttctccagcacagatcctggtcccatctttaggtat 
gaagagggccacatgggagcagaggacagcagatggctccacctgctgaggaagggacagtgtttgtgggtgttcagggg 
atggtgctgctgggccctgccgtgtccccaccctgtttttctggatttgatgttgaggaacctccgctccagcccccttt 

TGGCTCCCAGTGCTCCCAGGCCCTACCGTGGCAGCTAGAAGAAGTCCCGATTTCACCCCCTCCCCACAAACTCCCAAGAC 
ATGTAAGACTTCCGGCCATGCAGACAAGGAGGGTGACCTTCTTGGGGCTCTTTTTTTTCTTTTTTTCTTTTTATGGTGGC 
AAAAGTCATATAACATGAGATTGGCACTCCTAACACCGTTTTCTGTGTACAGTGCAGAATTGCTAACTCGGCGGTGTTTA 
CAGCAGGTTGCTTGAAATGCTGCGTCTTGCGTGACTGGAAGTCCCTACCCATCGAACGGCAGCTGCCTCACACCTGCTGC 
GGCTCAGGTGGACCACGCCGAGTCAGATAAGCGTCATGCAACCCAGTTTTGCTTTTTGTGCTCCAGCTTCCTTCGTTGAG 
GAGAGTTTGAGTTCTCTGATCAGGACTCTGCCTGTCATTGCTGTTCTCTGACTTCAGATGAGGTCACAATCTGCCCCTGG 
CTTATGCAGGGAGTGAGGCGTGGTCCCCGGGTGTCCCTGTCACGTGCAGGGTGAGTGAGGCGTTGCCCCCAGGTGTCCCT 
GTCACGTGTAGGGTGAGTGAGGCGCGGCCCCCGGGTGTCCCTGTCCCGTGCAGCGTGATTGAGGTGTGGCCCCCGGGTGT 
CCCTGTCACGTGTAGGGTGAGTGAGGCGCCATCCCCGGGTGTCCCTGTCACGTGTAGGGTGAGTGAGGCGTGGTCCCCGG 
GTGTCCCTGTCCCGTGCAGGGTGAGTGAGGCACTGTCCCCGGGTGTCCCTGTCACGTGCAGGGTGAGTGAGGCGCGGTCC 
CCGGGTGTCCCTCTCAGGTGTAGGGTGAGTGAGGCGCGGCCCCAGGGTGTCCCTGTCACGTGTAGGGTGAGTGAGGCACC 
GTCCCTGGGTGTCCCTCCCAGGTATAGGGTGAGTGAGGCACTGTCCCCGGGTGTCCCTGTCACGTGCAGGGTGAGTGAGG 
CGCGGCCCCCGGGTGTCCCTCTCAGGTGCAGGGTGAGTGAGGCGCTGTCCCTGGGTGTCCCTGTCTCGTGTAGGGTGAGT 
GAGGCTCTGTCCCCAGGTGTCCTTGGCGTTTGCTCACTTGAGCTTGCTCCTGAATGTTTGCTCTTTCTATAGCCACAGCT 
GCGCCGGTTGCCCATTGCCTGGGTAGATGGTGCAGGCGCAGTGCTGGTCCCCAAGCCTATCTTTTCTGATGCTCGGCTCT 
TCTTGGTCACCTCTCCGTTCCATTTTGCTACGGGGACACGGGACTGCAGGCTCTCGCCTCCCGCGTGCCAGGCACTGCAG 
CCACAGCTTCAGGTCCGCTTGCCTCTGTTGGGCCTGGCTTGCTCACCACGTGCCCGCCACATGCATGCTGCCAATACTCC 
TCTCCCAGCTTGTCTCATGCCGAGGCTGGACTCTGGGCTGCCTGTGTCTGCTGCCACGTGTTGCTGGAGACATCCCAGAA 
AGGGTTCTCTGTGCCCTGAAGGAAAGCAAGTCACCCCAGCCCCCTCACTTGTCCTGTTTTCTCCCAAGCTGCCCCTCTGC 
TTGGCCCCCTTGGGTGGGTGGCAACGCTTGTCACCTTATTCTGGGCACCTGCCGCTCATTGCTTAGGCTGGGCTCTGCCT 
CCAGTCGCCCCCTCACATGGATTGACGTCCAGCCACAGGTTGGAGTGTCTCTGTCTGTCTCCTGCTCTGAGACCCACGTG 
GAGGGCCGGTGTCTCCGCCAGCCTTCGTCAGACTTCCCTCTTGGGTCTTAGTTTTG«_ATTTCACTGATTTACCTCTGACG 
TTTCTATCTCTCCATTGTATGCTTTTTCTTGGTTTATTCTTTCATTCCTTTTCTAGCTTCTTAGTTTAGTCATGCCTTTC 
CCTCTAAGTGCTGCCTTACCTGCACCCTGTGTTTTGATGTGAAGTAATCTCAACATCAGCCACTTTCAAGTGTTCTTAAA 
ATACTTCAAAGTGTTAATACTTCTTTTAAGTATTCTTATTCTGTGATTTTTTTCTTTGTGCACGCTGTGTTTTGACGTGA 
AATCATTTTGATATCAGTGACTTTTAAGTATTCTTTAGCTTATTCTGTGATTTCTTTGAGCAGTGAGTTATTTGAACACT 
GTTTATGTTCAAGATATGTAGAGTATCAAGATACGTAGAGTATTTTAAGTTATCATTTTATTATTGATTTCTAACTCAGT
TGTGTAGTGGTCTGTATAATACCAATTATTTGAAGTTTGCGGAGCCTTGCTTTGTGATCTAGTGTGTGCATGGTTTCCAG 
AACTGTCCATTGTAAATTTGACATCCTGTCAATAGTGGGCATGCATGTTCACTATATCCAGCTTATTAAGGTCCP^GTGCA 
AAGCTTCTGTCTCCTTCTAGATGCATGAAATTCCAAGAAGGAGGCCATAGTCCCTCACCTGGGGGATGGGTCTGTTCATT
TCTTCTCGTTTGGTAGCATTTATGTGAGGCATTGTTAGGTGCATGCACGTGGTAGAATTTTTATCTTCCTGATGAGTGAA 
TCTTTTGGAGACTTCTATGTCTCTAGTAATCTAGTAATTCTTTTTTTAAATTGCTCTTAGTACTGCCACACTGGGCTTCT 
TTTGATTAGTATTTTCCTGCTGTGTCTGTTTTCTGCCTTTAATTTATATATATATATATATTTTTTTTTTTTTTGAGACA 
GAGTCTTGGTCTGTCGCCCAGGGTGAGTGCAGTGGTGTGATCACAGGTCAGTGTAACTTTTACCTTCTGGCCTGAGCCGT 
CCTCTCACCTCAGCCTCCTGAGTAGCTGGAACTGCAGACACGCACCGCTACACCTGGCTAATTTTTAAATTTTTTCTGGA 
GACAGGGTCTTGCTGTGTTGCCCAGGCTGGTCTCAAACTCTTGGACTCAAGGGATCCATCTACCTCGGCTTCCCAAAGTG 
CTGAATTACAGGCATGAGCCACCATGTCTGGCCTAATTTTCAACACTTTTATATTCTTATAGTGTGGGTATGTCCTGTTA 
ACAGCATGTAGGTGAATTTCCAATCCAGTCTGACAGTCGTTGTTTAACTGGATAACCTGATTTATTTTCATTTTTTTGTC 
ACTAGAGACCCGCCTGGTGCACTCTGATTCTCCACTTGCCTGTTGCATGTCCTCGTTCCCTTGTTTCTCACCACCTCTTG 
GGTTGCCATGTGCGTTTCCTGCCGAGTGTGTGTTGATCCTCTCGTTGCCTCCTGGTCACTGGGCATTTGCTTTTATTTCT 
CTTTGCTTAGTGTTACCCCCTGATCTTTTTATTGTCGTTGTTTGCTTTTGTTTATTGAGACAGTCTCACTCTGTCACCCA 
GGCTGGAGTGTAATGGCACAATCTCGGCTCACTGCAACCTCTGCCTCCTCGGTTCAAGCAGTTCTCATTCCTCAACCTCA 
TGAGTAGCTGGGATTACAGGCGCCCACCACCACGCCTGGCTAATTTTTGTATTTTTAGTAGAGATAGGCTTTCACCATGT 
TGGCCAGGCTGGTCTCAAACTCCTGACCTCAAGTGATCTGCCCGCCTTGGCCTCCCACAGTGCTGGGATTACAGGTGCAA 
GCCACCGTGCCCGGCATACCTTGATCTTTTAAAATGAAGTCTGAAACATTGCTACCCTTGTCCTGAGCAATAAGACCCTT 
AGTGTATTTTAGCTCTGGCCACCCCCCAGCCTGTGTGCTGTTTTCCCTGCTGACTTAGTTCTATCTCAGGCATCTTGACA 
CCCCCACAAGCTAAGCATTATTAATATTGTTTTCCGTGTTGAGTGTTTCTGTAGCTTTGCCCCCGCCCTGCTTTTCCTCC 
TTTGTTCCCCGTCTGTCTTCTGTCTCAGGCCCGCCGTCTGGGGTCCCCTTCCTTGTCCTTTGCGTGGTTCTTCTGTCTTG 
TTATTGCTGGTAAACCCCAGCTTTACCTGTGCTGGCCTCCATGGCATCTAGCGACGTCCGGGGACCTCTGCTTATGATGC 
ACAGATGAAGATGTGGAGACTCACGAGGAGGGCGGTCATCTTGGCCCGTGAGTGTCTGGAGCACCACGTGGCCAGCGTTC 
CTTAGCCAGTGAGTGACAGCAACGTCCGCTCGGCCTGGGTTCAGCCTGGAAAACCCCAGGCATGTCGGGGTCTGGTGGCT 
CCGCGGTGTCGAGTTTGAAATCGCGCAAACCTGCGGTGTGGCGCCAGCTCTGACGGTGCTGCCTGGCGGGGGAGTGTCTG 
CTTCCTCCCTTCTGCTTGGGAACCAGGACAAAGGATGAGGCTCCGAGCCGTTGTCGCCCAACAGGAGCATGACGTGAGCC 
ATGTGGATAATTTTAAAATTTCTAGGCTGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCAAGGCGGG 
TGGATCACGAGGTCAGGAGGTCGAGACCATCCTGGCCAACATGATGAAACCCCATCTGTACTAAAAACACAAAAATTAGC 
TGGGCGTGGTGGCGGGTGCCTGTAATCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATTGCTTGAACCTGGGAGTTGGAA 
GTTGCAGTGAGCCGACATTGCACCACTGCACTCCAGCCTGGCAACACAGCGAGACTCTGTCTCAAAAAAAAAAAAAAAAA 
AAAAAAAAAAAATTCTAGTAGCCACATTAAAAAAGTAAAAAAGAAAAGGTGAAATTAATGTAATAATAGATTTTACTGAA
GCCCAGCATGTCCACACCTCATCATTTTAGGGTGTTATTGGTGGGAGCATCACTCACAGGACATTTGACATTTTTTGAGC 
TTTGTCTGCGGGATCCCGTGTGTAGGTCCCGTGCGTGGCCATCTCGGCCTGGACCTGCTGGGCTTCCCATGGCCATGGCT 
GTTGTACCAGATGGTGCAGGTCCGGGATGAGGTCGCCAGGCCCTCAGTGAGCTGGATGTGCAGTGTCCGGATGGTGCACG 
TCTGGGATGAGGTCGCCAGGCCCTGCTGTGAGCTGGATGTGTGGTGTCTGGATGGTGCAGGTCAGGGGTGAGGTCTCCAG 
GCCCTCGGTGAGCTGGAGGTATGGAGTCCGGATGATGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCTGTGAGCTGGATG 
TGTGGTGTCTGGATGGTGCAGGTCAGGGGTGAGGTCTCCAGGCCCTCGGTAAGCTGGAGGTATGGAGTCCGGATGATGCA 
GGTCCGGGGTGAGGTCGCCAGGCCCTGCTGTGAGCTGGATGTGTGGTGTCTGGATGGTGCAGGTCTGGGGTGAGGTCACC 
AGGCCCTGCGGTGAGCTGGGTGTGCGGTGTCTGGATGGTGCAGGTCTGGAGTGAGGTCGCCAGACGGTGCCAGACCATGC 
GGTGAGCTGGATATGCGGTGTCCGGATGGTGCAGGTCTGGGGTGAGGTTGCCAGGCCCTGCTGTGAGTTGGATGTGGGGT 
GTCCGGATGCTGCAGGTCCGGTGTGAGGTCACCAGGCCCTGCTGTGAGCTGGATGTGTGGTGTCTGGATGGTGCAGGTCT 
GGGGTGAAGGTCGCCAGGCCCCTGCTTGTGAGCTGGATGTGTGGTGTCTGGATGGTGCAGGTCTGGAGTGAGGTCGCCAG 
GCCCTCGGTGAGCTGGATGTGCAGTGTCCAGATGGTGCAGGTCCGGGGTGAGGTCGCCAGACCCTGCGGTGAGCTGGATG 
TGCGGTGTCTGGATGGTGCAGGTCTGGAGTGAGGTCGCCAGGCCCTCGGTGAGCTGGATGTATGGAGTCCGGATGGTGCC 
GGTCCGGGGTGAGGTCGCCAGACCCTGCTGTGAGCTGGATGTGCGGTGTCTGGATGGTACAGGTCTGGAGTGAGGTCGCC
AGACCCTGCTGTGAGCTGGATATGCGGTGTCCGGATGGTGCAGGTCAGGGGTGAGGTCTCCAGGCCCTCGGTGAGCTGGA 
GGTATGGAGTCCGGATGATGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCTGTGAACTGGATGTGCGGCGTCTGGATGGT 
GCAGGTCTGGGGTGTGGTCGCCAGGCCCTCGGTGAGCTGGAGGTATGGAGTCCGGATGATGCAGGTCCGGGGTGAGGTCG 
CCAGGCCCTGCTGTGAGCTGGATGTGCGGCGTCTGGATGGTGCAGGTCTGGGGTGTGGTCGCCAGGCCCTCGGTGAGCTG 
GAGGTATGGAGTCCGGATGATGCAGGTCCGGGGTGAGGTTGCCAGGCCCTGCTGTGAGCTGGATGTGCTGTATCCGGATG
GTGCAGTCCGGGGTGAGGTCGCCAGGCCCTGCTGTGAGCTGGATGTGCTGTATCCGGATGGTGCAGGTCTGGGGTGAGGT 
CACCAGGCCCTGCGGTGAGCTGGTTGTGCGGTGTCCGGTTGCTGCAGGTCCGGGGTGAGTTCGCCAGGCCCTCGGTGAGC 
TGGATGTGCGGTGTCCCCGTGTCCGGATGGTGCAGGTCCAGGGTGAGGTCGCTAGGCCCTTGGTGGGCTGGATGTGCCGT 
GTCCGGATGGTGCAGGTCTGGGGTGAGGTCGCCAGGCCTTTGGTGAGCTGGATGTGCGGTGTCTGCATGGTGCAGGTCTG
GGGTGAGGTCGCCAGGCCCTTGGTGGGCTGGATGTGTGGTGTCCGGATGGTGCAGGTCCGGCGTGAGGTCGCCAGGCCCT 
GCTGTGAGCTGGATGTGCGGTGTCTGGATGGTGCAGGTCCGGGGTGAGGTAGCCAAGGCCTTCGGTGAGCTGGATGTGGG 
GTGTCCGGATGGTGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCGGTTAGCTGGATATGCGGTGTCCGGATGGTGCAGGT 
CCGGGGTGAGGTCACCAGGCCCTGCGGTTAGCTGGATGTGCGGTGTCTGGATGGTGCAGGTCCGGGGTGAGGTCGCCAGG
CCCTGCTGTGAGCTGGATGTGCTGTATCCGGATGGTGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCAGTGAGCTGGATG
TGCTGTATCCGGATGGTGCAGGTCTGGCGTGAGGTCGCCAGGCCCTGCGGTTAGCTGGATATGCGGTGTCGGATGGTGCA 
GGTCCGGGGTGAGGTCACCAGGCCCTGCGGTTAGCTGGATGTGCGGTGTCCGGATGGTGCAGGTCTGGGGTGAGGTCGCC 
AGGCCCTGCTGTGAGCTGGATGTGCTGTATCCGGATGGTGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCGGTGAGCTGG 
ATGTGCTGTATCCGGATGGTGCAGGTCTGGCGTGAGGTCGCCAGGCCCTGCGGTGAGCTGGATGTGCAGTGTACGGATGG
TGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCGGTGGGCTGTATGTGTGTTGTCTGGATGGTGCAGGTCCGGGGTGAGTT
CGCCAGGCCCTGCGGTGAGCTGGATGTGTGGTGTCTGGATGCTGCAGGTCCGGGGTGAGTTCGCCAGGCCCTCGGTGAGC 
TGGATATGCGGTGTCCCCGTGTCCGAATGGTGCAGGTCCAGGGTGAGGTCGCCAGGCCCTTGGTGGGCTGGATGTGCCGT 
GTCCGGATGGTGCAGGTCTGGGGTGAGGTCGCCAGGCCCTTGGTGAGCTGGATGTGCGGTGTCCGGATGGTGCAGGTCCG 
GGGTGAGGTCACCAGGCCCTCGGTGATCTGGATGTGGCATGTCCTTCTCGTTTAAG 

Intron 3 (SEQ ID NO 6) 

GTACTGTATCCCCACGCCAGGCCTCTGCTTCTCGAAGTCCTGGAACACCAGCCCGGCCTCAGCATGCGCCTGTCTCCACT 
TGCCTGTGCTTCCCTGGCTGTGCAGCTCTGGGCTGGGAGCCAGGGGCCCCGTCACAGGCCTGGTCCAAGTGGATTCTGTG 
CAAGGCTCTGACTGCCTGGAGCTCACGTTCTCTTACTTGTAAAATCAGGAGTTTGTGCCAAGTGGTCTCTAGGGTTTGTA
AAGCAGAAGGGATTTAAATTAGATGGAAACACTACCACTAGCCTCCTTGCCTTTCCCTGGGATGTGGGTCTGATTCTCTC
YCTCTTTTTTTTTTCTTTTTTGAGATGGAGTCTCACTCTGTTGCCCAGGCTGGAGTGCAGTGGCATAATCTTGGCTCACT 
GCAACCTCCACCTCCTGGGTTTAAGCGATTCACCAGCCTCAGCCTCCTAAGTAGCTGGGATTACAGGCACCTGCCACCAC 
GCCTGGCTAATTTTTGTACTTTTAGGAGAGACGGGGTTTCACCATGTTGGCCAGGCTGGTCTCGAACTCATGACCTCAGG 
TGATCCACCCACCTTGGCCTCCCAAAGTGCTGGGTTTACAGGCTAAGCCACCGTGCCCAGCCCCCGATTCTCTTTTAATT
CATGCTGTTCTGTATGAATCTTCAATCTATTGGATTTAGGTCATGAGAGGATAAAATCCCACCCACTTGGCGACTCACTG
CAGGGAGCACCTGTGCAGGGAGCACCTGGGGATAGGAGAGTTCCACCATGAGCTAACTTCTAGGTGGCTGCATTTGAATG 
GCTGTGAGATTTTGTCTGCAATGTTCGGCTGATGAGAGTGTGAGATTGTGACAGATTCAAGCTGGATTTGCATCAGTGAG 
GGACGGGAGCGCTGGTCTGGGAGATGCCAGCCTGGCTGAGCCCAGGCCATGGTATTAGCTTCTCCGTGTCCCGCCCAGGC 
TGACTGTGGAGGGCTTTAGTCAGAAGATCAGGGCTTCCCCAGCTCCCCTGCACACTCGAGTCCCTGGGGGGCCTTGTGAC
ACCCCATGCCCCAAATCAGGATGTCTGCAGAGGGAGCTGGCAGCAGACCTCGTCAGAGGTAACACAGCCTCTGGGCTGGG
GACCCCGACGTGGTGCTGGGGCCATTTCCTTGCATCTGGGGGAGGGTCAGGGCTTTCCCTGTGGGAACAAGTTAATACAC 
AATGCACCTTACTTAGACTTTACACGTATTTAATGGTGTGCGACCCAACATGGTCATTTGACCAGTATTTTGGAAAGAAT 
TTAATTGGGGTGACCGGAAGGAGCAGACAGACGTGGTGGTCCCCAAGATGCTCCTTGTCACTACTGGGACTGTTGTTCTG 
CCTGGGGGGCCTTGGAGGCCCCTCCTCCCTGGACAGGGTACCGTGCCTTTTCTACTCTGCTGGGCCTGCGGCCTGCGGTC
AGGGCACCAGCTCCGGAGCACCCGCGGCCCCAGTGTCCACGGAGTGCCAGGCTGTCAGCCACAGATGCCCAGGTCCAGGT
GTGGCCGCTCCAGCCCCCGTGCCCCCATGGGTGGTTTTGGGGGAAAAGGCCAAGGGCAGAGGTGTCAGGAGACTGGTGGG 
CTCATGAGAGCTGATTCTGCTCCTTGGCTGAGCTGCCCTGAGCAGCCTCTCCCGCCCTCTCCATCTGAAGGGATGTGGCT 
CTTTCTACCTGGGGGTCCTGCCTGGGGCCAGCCTTGGGCTACCCCAGTGGCTGTACCAGAGGGACAGGCATCCTGTGTGG 
AGGGGCATGGGTTCACGTGGCCCCAGATGCAGCCTGGGACCAGGCTCCCTGGTGCTGATGGTGGGACAGTCACCCTGGGG
GTTGACCGCCGGACTGGGCGTCCCCAGGGTTGACTATAGGACCAGGTGTCCAGGTGCCCTGCAAGTAGAGGGGCTCTCAG
AGGCGTCTGGCTGGCATGGGTGGACGTGGCCCCGGGCATGGCCTTCAGCGTGTGCTGCCGTGGGTGCCCTGAGCCCTCAC 
TGAGTCGGTGGGGGCTTGTGGCTTCCCGTGAGCTTCCCCCTAGTCTGTTGTCTGGCTGAGCAAGCCTCCTGAGGGGCTCT 
CTATTGCAG 

Intron 4 (SEQ ID NO 7)

GTGGCTGTGCTTTGGTTTAACTTCCTTTTTAAACAGAAGTGCGTTTGAGCCCCACATTTGGTATCAGCTTAGATGAAGGG 
CCCGGAGGAGGGGCCACGGGACACAGCCAGGGCCATGGCACGGCGCCAACCCATTTGTGCGCACAGTGAGGTGGCCGAGG 
TGCCGGTGCCTCCAGAAAAGCAGCGTGGGGGTGTAGGGGGAGCTCCTGGGGCAGGGACAGGCTCTGAGGACCACAAGAAG 
CAGCCGGGCCAGGGCCTGGATGCAGCACGGCCCGAGGTCCTGGATCCGTGTCCTGCTGTGGTGCGCAGCCTCCGTGCGCT
TCCGCTTACGGGGCCCGGGGACCAGGCCACGACTGCCAGGAGCCCACCGGGCTCTGAGGATCCTGGACCTTGCCCCACGG 
CTCCTGCACCCCACCCCTGTGGCTGCGGTGGCTGCGGTGACCCCGTCATCTGAGGAGAGTGTGGGGTGAGGTGGACAGAG 
GTGTGGCATGAGGATCCCGTGTGCAACACACATGCGGCCAGGAACCCGTTTCAAACAGGGTCTGAGGAAGCTGGGAGGGG 
TTCTAGGTCCCGGGTCTGGGTGGCTGGGGACACTGGGGAGGGGCTGCTTCTCCCCTGGGTCCCTATGGTGGGGTGGGCAC 
TTGGCCGGATCCACTTTCCTGACTGTCTCCCATGCTGTCCCCGCCAG

Intron 5 (SEQ ID NO 8) 

GTGGGTGCCGGGGACCCCCGTGAGCAGCCCTGCTGGACCTTGGGAGTGGCTGCCTGATTGGCACCTCATGTTGGGTGGAG 
GAGGTACTCCTGGGTGGGCCGCAGGGAGTGCAGGTGACCCTGTCACTGTTGAGGACACACCTGGCACCTAGGGTGGAGGC 
CTTCAGCCTTTCCTGCAGCACATGGGGCCGACTGTGCACCCTGACTGCCCGGGCTCCTATTCCCAAGGAGGGTCCCACTG
GATTCCAGTTTCCGTCAGAGAAGGAACCGCAACGGCTCAGCCACCAGGCCCCGGTGCCTTGCACCCCAGTCCTGAGCCAG 
GGGTCTCCTGTCCTGAGGCTCAGAGAGGGGACACAGCCCGCCCTGCCCTTGGGGTCTGGAGTGGTGGGGGTCAGAGAGAG 
AGTGGGGGACACCGCCAGGCCAGGCCCTGAGGGCAGAGGTGATGTCTGAGTTTCTGCGTGGCCACTGTCAGTCTCCTCGC 
CTCCACTCACACAG 

5'-region intron 6 (SEQ ID NO 9)

GTAAGGTTCACGTGTGATAGTCGTGTCCAGGATGTGTGTCTCTGGGATATGAATGTGTCTAGAATGCAGTCGTGTCTGTG 
ATGCGTTTCTGTGGTGGAGGTACTTCCATGATTTACACATCTGTGATATGCGTGTGTGGCACGTGTGTGTCGTGGTGCAT 
GTATCTGTGGCGTGCATATTTGTGGTGTGTGTGTGTGTGGCACGTGTGTGTCCATGGTGTGTGTGCCTGTGGTGTGCATG
TGTGTGTGTCTGTGACACGTGCATGTTCATGCTGTGTGCTGCATGTCTGTGATGTGCCTATTTGTGGTGTGTGTGTGCAT
GTGTCCGTGACATATGCGTGTCTATGGCATGGGTGTGTGTGGCCCCTTGGCCTTACTCCTTCCTCCTCCAGGCATGGTCC 
GCACCATTGTCCTCACGCTCTCGGGTGCTGGTTTGGGGAGCTCCACATTCAGGGTCCTCACTTCTAGCATGGGTGCCCCT 
GTCCTGTCACAGGGCTGGGCCTTGGAGACTGTAAGCCAGGTTTGAGAGGAGAGTAGGGATGCTGGTGGTACCTTCCTGGA 
CCCCTGGCACCCCCAGGACCCCAGTCTGGCCTATGCCGGCTCCATGAGATATAGGAAGGCTGATTCAGGCCTCGCTCCCC
GGGACACACTCCTCCCAGAGCGGCCGGGGGCCTTGGGGCTCGGCAGGGGTGAAAGGGGCCCTGGGCTTGGGTTCCCACCC
AGTGGTCATGAGCACGCTGGAGGGGTAAGCCCTCAAAGTCGTGCCAGGCCGGGGTGCAGAGGTGAAGAAGTATCCCTGGA 
GCTTCGGTCTGGGGAGAGGCACATGTGGAAACCCACAAGGACCTCTTTCTCTGACTTCTTGAGCT 

3'-region intron 6 (SEQ ID NO 10)

TGTGGGATTGGTTTTCATGTGTGGGATAGGTGGGGATCTGTGGGATTGGTTTTTATGAGTGGGGTAACACAGAGTTCAAG
GCGAGCTTTCTTCCTGTAGTGGGTCTGCAGGTGCTCCAACAGCTTTATTGAGGAGACCATATCTTCCTTTGAACTATGGT 
CGGGTTTATAGTAAGTCAGGGGTGTGGAGGCCTCCCCTGGGCTCCCTGTTCTGTTTCTTCCACTCTGGGGTCGTGTGGTG 
CCTGCTGTGGTGTGTGGCCGGTGGGCAGGGCTTCCAGGCCTCCTTGTGTTCATTGGCCTGGATGTGGCCCTGGCTACGCT 
CCGTCCTTGGAATTCCCCTGCGAGTTGGAGGCT7TCTTTCTTTCTTTTTTTCTTTCTTTTTTTTTTTTTTTGATAACAGA
GTCTCGCTCTTTTTTGCCCAGGCTGGAGTGGTTTGGCGTGATCTTGGCTCACTGCAACCTGTGCTTCCTGAGTTCAAGCA
ATTCTCTTGCCTCAGCCTCCCAAGTAGCTGGAATTATAGGCGCCCACCACCATGCTGACTAATTTTTGTAATTTTAGTAG 
AGACGAGGTTTCTCCATGTTGGCCAGGCTGGTC7CGAACTCCTGACCTCAGGTGATCCTCCCACCTCGGCCTCCCAAAGT 
GCTGGGATGACAGGTGTGAACCGCCGCGCCCGGCCGAGACTCGCTTCCTGCAGCTTCCGTGAGATCTGCAGCGATAGCTG 
CCTGCAGCCTTGGTGCTGACAACCTCCGTTTTCCTTC7CCAGGTCTCGCTAGGGGTCTTTCCATTTCATGACTCTCTTCA 
CAGAAGAGTTTCACGTGTGCTGATTTCCCGGCTGTTTCCTGCGTAATTGGTGTCTGCTGTTTATCGATGGCCTCCTTCCA
TTTCCTTTAGGCTTTGTTTATTGTTGTTTTTCCGGCTCCTTGAAGGAAAAGTTTCGATTATGGATGTTTGAACTTTCTTT 
TCTAAACAAGCATCTGAAGTTGCCGTTTTCCCTCTAAAGCAGGGATCCCGAGGCCCCTGGCTGTGGAGTGGCACCGGTCT 
GGGGCCTGTTAGGAACCCGGCGCACAGCGGGAGGCTAGGTGGGGTGTGGGGAGCCAGCGTTCCCGCCTGAGCCCCGCCCC 
TCTCAGATCAGCAGTGGCATGCGGTGCTCAGAGGCGCACACACCCTACTGAGAACTGTGCGTGAGAGGGGTCTAGATTCT 
GTGCTCCTTATGGGAATCTAATGCCTGATGATCTGAGGTGGAACCGTTTGCTCCCAAAACCATCCCCTTCCCCACTGCTG 
TCCTGTGGAAAAATCGTCTTCCACGAAACCAGTCCCTGGTACCACAATGGTTGGGGACCCTGTGCTAAAGACCTGCTTCA 
GCAGCCTCTCGTCAGTGTTGATATATTGGCTTTTCTGTGTTGAGTCCAGAATAATTACGGATTTCTGTGATGCTTTCCGC 
CGACCTCAGACCCATGGGCTATTTGTGGGCGTGTTGCCTGCTCCTGGGTTGGGAAGGGTGCAGGCCCCATGTACCTTCCT 
GTTACTGCCTTCCAGGTTGGTTCTCAGGGTTGAATCGTACTCGATGTGGTTTTAGCCCACGGCCCTGCCGCCAGCTCCTG 
GGGGCTGGGGAACATGCTGAAGCACAGAGTCACCGTGCGCGTCTTTTGATGCCTCACAAGCTCGAGGCCTCCTGTGTCCG 
TGTTAGTGTGTGTCACGTGCCTGCTCACATCCTGTCTTGGGGACGCAGGGGCTTAGCAGGTCCCGTAGTAAATGACAAGC 
GTCCTGGGGGAGTCTGCAGAATAGGAGGTGGGGGTGCCGGTCTCTCTCCCGCGTCTTCAGACTCTTCTCCTGCCTGTGCT 
GTGGCTGCACCTGCATCCCTGCAATCCCTCCAGCACTGGGCTGGAGAGGCCCGGGAGCTCGAGTGCCACTTGTGCCACGT 
GACTGTGGATGGCAGTCGGTCACGGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTTGGTCACAGGGGTCTGATGTGTG 
GTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGG 
ATGGCGGTCGTGGGGTCTGATGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGGTGACTGTGGATGGCGGTCGTG 
GGGTCTGATGTGGTGACTGTGGATGGCAGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATG 
TGGTGACTGTGGATGGCAGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACT 
GTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGG 
CGGTCGTGGGGTCTGATGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGTGATCGGTCA 
CAGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGTGATCGGTCACAG 
GGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTTGGTCCCGGGGG 
TCTGATGTGTGGTGACTGTGGATGGCGATCGGTCACAGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCT 
GATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGGT 
GACTGTGGATGGCGGTCGTGGGGTCTGATGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGAT 
GGCGGTTGGTCCCGGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGGTGACTGTGGATGGCAG 
TCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGG 
TCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGT 
GGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGTGATCGGTCACAGGGGTCTGATGTGTGGT 
GACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGGTGACTGTGGAT 
GGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTAGGGTCTGATGTGTGGTGACTGTGGATGGCAGTCG 
GTCACAGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGG 
GGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGAT 
GTGGTGACTGTGGATGGTGATCGGTCACAGGGGTCTGATGTGTGGTAGCTGCAGGTGGAGTCCCAGGTGTGTCTGTAGCT 
ACTTTGCGTCCTCGGCCCCCCGGCCCCCGTTTCCCAAACAGAAGCTTCCCAGGCGCTCTCTGGGCTTCATCCCGCCATCG 
GGCTTGGCCGCAGGTCCACACGTCCTGATCGGAAGAAACAAGTGCCCAGCTCTGGCCGGGGCAGGCCACATTTGTGGCTC 
ATGCCCTCTCCTCTGCCGGCAG 

Intron 7 (SEQ ID NO 11) 

GTCTGGGCACTGCCCTGCAGGGTTGGGCACGGACTCCCAGCAGTGGGTCCTCCCCTGGGCAATCACTGGGCTCATGACCG 
GACAGACTGTTGGCCCTGGGGGGCAGTGGGGGGAATGAGCTGTGATGGGGGCATGATGAGCTGTGTGCCTTGGCGAAATC 
TGAGCTGGGCCATGCCAGGCTGCGACAGCTGCTGCATTCAGGCACCTGCTCACGTTTGACTGCGCGGCCTCTCTCCAGTT 
CCGCAGTGCCTTTGTTCATGATTTGCTAAATGTCTTCTCTGCCAGTTTTGATCTTGAGGCCAAAGGAAAGGTGTCCCCCT 
CCTTTAGGAGGGCAGGCCATGTTTGAGCCGTGTCCTGCCCAGCTGGCCCCTCAGTGCTGGGTCTGAGGCCAAAGGAAACG 
TGTCCCCCTTCTTAGGAG jACGGGCCGTGTTTGAGCCACGCCCCGCTGAGCGGGCCTCTCAGTGCTGGGTCTGTCCACGT 
GGCCCTGTGGCCCTTTGCAGATGTGGTCTGTCCACGTGGCCCTGTGGCTCTTTGCAGATGCCTGTTAGCACTTGCTCGGC
TCTAGGGGACAGTCGTGTCCACCGCATGAGGCTCAGAGACCTCTGGGCGAATTTCCTTGGCTCCCAGGGTGGGGGTGGAG 
GTGGCCTGGGCTGCTGGGACCCAGACCCTGTGCCCGGCAGCTGGGCAGCAACTCCTGGATCACATATGCCATCCGGGCCA 
CGGTGGGCTGTGTGGGTGTGAGCCCAGCTGGACCCACAGGTGGCCCAGAGGAGACGTTCTGTGTCACACACTCTGCCTAA 
GCCCATGTGTGTCTGCAGAGACTCGGCCCGGCCAGCCCACGATGGCCCTGCATTCCAGCCCAGCCCCGCACTTCATCACA 
AACACTGACCCCAAAAGGGACGGAGGGTCTTGGCCACGTGGTCCTGCCTGTCTCAGCACCCACCGGCTCACTCCCATGTG 
TCTCCCGTCTGCTTTCGCAG 

Intron 8 (SEQ ID NO 12) 

GTGAGTCAGGTGGCCAGGTGCCATTGCCCTGCGGGTGGCTGGGCGGGCTGGCAGGGCTTCTGCTCACCTCTCTCCTGCCC 
CTTCCCCACTGNCCTTCTGCCCGGGGCCACCAGAGTCTCCTTTTCTGGCCCCCGCCCCCTCCGGCTCCTGGGCTGCAGGC 
TCCCGAGGCCCCGGAAACATGGCTCGGCTTGCGGCAGCCGGAGCGGAGCAGGTGCCACACGAGGCCTGGAAATGGCAAGC 
GGGGTGTGGAGTTGCTCCTGCGTGGAGGACGAGGGGCGGGGGGTGTGTCTGGGTCAGGTGTGCGCCGAGCGTTTGAGCCT 
GCAGCTTGTCAGCTCCAAGTTACTACTGACGCTGGACACCCGGCTCTCACACGCTTGTATCTCTCTCTCCCGATACAAAA 
GGATTTTATCCGATTCTCATTCCTGTCCCTGTCGTGTGACCCCCGCGAGGGCGCGGGCTCTTCTCTCTGTGACTAGATTT 
CCCATCTGGAAAGTGCGGGGTTGACCGTGTAGTTTGCTCCTCTCGGGGGGCCTGTGGTGGCCATGGGGCAGGCGGCCTGG 
GAGAGCTGCCGTCACACAGCCACTGGGTGAGCCACACTCACGGTGGTAGAGCCACAGTGCCTGGTGCCACATCACGTCCT 
CTGGATTTTAAGTAAAACCACACACCTCCCGGCAGGCATCTGCCTGCGACCCTGTGTGTGCCTGGGGAGAGTGGTAGCAC 
GGAGGAAATTCGTGCACACTCAAGGTCATCAGCAAGGTCATCCGCAGTCAGGTGGAACGTGGAGGCCTCTCTCTGGGATC 
GTCTCCAGCGGATAAAGGACTGTGCACAGCTTCGGAAGCTTTTATTTAAAAATATAACTATTAATTATTGCATTATAAGT 
AATCACTAATGGTATCAGCAATTATAATATTTATTAAAGTATAATTAGAAATATTAAGTAGTACACACGTTCTGGAAAAA 
CACAAATTGCACATGGCAGCAGAGTGAATTTTGGCCGAGGGACACGTGTGCACATGTGTGTAAGCGGCCCCCAGGCCCAC 
AGAATTCGCTGACAAAGTCACCTCCCCAGAGAAGCCACCACGGGCCTCCTTCGTGGTCGTGAATTTTATTAAGATGGATC 
AAGTCACGTACCGTCCACGTGTGGCAGGGCTTTGGGGAATGTGAGGTGATGACTGCGTCCTCATGCCCTGACAGACAGGA 
GGTGACTGTGTCTGTCCTGTCCCTAGGACACGGACAGGCCCGAAGCTCTAGTCCCCATCGTGGTCCAGTTTGGCCTCTGA 
ATAAAAACGTCTTCAAAACCTGTTGCCCCAAAAACTAAGAACAGAGAGAGTTTCCCATCCCATGTGCTCACAGGGGCGTA 
TCTGCTTGCGTTGACTCGCTGGGCTGGCCGGACTCCTAGAGTTGGTGCGTGTGCTTCTGTGCAAAAAGTGCAGTCCTCTT 
GCCCATCACTGTGATATCTGCACCAGCAAGGAAAGCCTCTTTTCTTTTCTTTCTTTTTTTTTTTTTGAGACGGAACGTCA 
CTGTTGTCTGCCTGGGCTTGAGTGCAGTGGCGCGATCTCAACTCACTGCAACCTCCGCCTCCCGGGTTCCAGCATTTCTC 
CTGCCTCAGCCTCCCGAGCAGCTGAGATTACAGGCACCCACCCCCTGCGCCTGGCTAATTTTTGTATTTTTAGTAGAGAG 
GGGTTTTTGCCATGTTGGCCAGGCTGGTCTCGAACTCCTGACCTCAGGTGATCCACCCACCTCGGCCTCCCAAAGTGCTG 
GGATTACAGGTGTGAGCCATCACGCCCAGCCGGAAAGCCTCTTTTTAAGGTGACCACCTATAGCGCTTCCCGAAAATAAC 
AGGTCTTGTTTTTGCAGTAGGCTGCAAGCGTCTCTTAGCAACAGGAGTGGCGTCCTGTGGGCTCTGGGGATGGCTGAGGG 
TCGCGTGGCAGCCATGCCTTCTGTGTGCACCTTTAGGTTCCACGGGGCTATTCTGCTCTCACTGTTTGTCTGAAAACGCA 
CGCTTGGCATCCTTGTTTGGAGAGTTTCTGCTTCTCGTTGGTCATGCTGAAACTAGGGGCAAGGTTGTATCCGTTGGCGC 
GCAGCGGCTACATGTAGGGTCATGAGTCTTTCACCGTGGACAAATTCCTTGAAAAAAAAAAAAGGAGTCCGGTTAAGCAT 
TCATTCCGGGTCAAGTGTCTGGTTCTGTGAATAAACTCTAAGATTTAAGAAACCTTAATGAAAGAAAACCTTGATGATTC 
AGAGCAAGGATGTGGTCACACCTGTGGCTGGATCTGTTTCAGCCGCCCCAGTGCATGGTGAGAGTGGGGAGCAGGGATTG 
TTTGTTCAGAGGTCTCATCTGGTATGTTTCTGAGGTGTTTGCCGGCTGAATGGTAGACGTGTCGTTTGTGTGTATGAGGT 
TCTGTGTCTGTGTGTGGCTCGGTTTGAGTGTACGCATGTCCAGCACATGCCCTGCCCGTCTCTCACCTGTGTCTTCCCGC 
CCCAG 

Intron 9 (SEQ ID NO 13) 

GTGAGGCCTCCTCTTCCCCAGGGGGGCTTGGGTGGGGGTTGATTTGCTTTTGATGCATTCAGTGTTAATATTCCTGGTGC 
TCTGGAGACCATGACTGCTCTGTCTTGAGGAACCAGACAAGGTTGCAGCCCCTTCTTGGTATGAAGCCGCACGGGAGGGG 
TTGCACAGCCTGAGGACTGCGGGCTCCACGCAGGCTCTGTCCAGCGGCCATGTCCAGAGGCCTCAGGGCTCAGCAGGCGG 
GAGGGCCGCTGCCCTGCATGATGAGCATGTGAATTCAACACCGAGGAAGCACACCAGCTTCTGTCACGTCACCCAGGTTC 
CGTTAGGGTCCTTGGGGAGATGGGGCTGGTGCAGCCTGAGGCCCCACATCTCCCAGCAGGCCCTCGACAGGTGGCCTGGA 
CTGGGCGCCTCTTCAGCCCATTGCCCATCCCACTTGCATGGGGTCTACACCCAAGGACGCACACACCTAAATATCGTGCC 
AACCTAATGTGGTTCAACTCAGCTGGCTTTTATTGACAGCAGTTACTTTTTTTTTTTTAATACTTTAAGTTCTAGGGTAC 
ATGTGCACGACGTGCAGGTTAGTTACATATGTATACATGTGCCATGTTGGTGTGCTGCACCCATTAACTCATCATTTACA 
TTAGGTATATCTCCTAATGCTATCCCTCCCCACTCCCCCCATCCCATGACAGGCCCTGGTGTGTGATGTTCCCCACCCTG 
TGTCCAAGTGTTCTCATTGTTCAGTTCCCACCTGTGAGTGAGAACATGTGGTGTTTGGTTTTCTTTCCTTGCAATAGTTT 
GCTCAGAGTGATGGTTTCCAGCTTCGTCCATGTCCCTACAAAGGACATGAACTCATCCTTTTTTATGACTGCATAGTATT 
CCGTGGTGTATATGTGCCACATTTTCTTAATCCAGTCTATCATCGATGGACATTTGGGTTGGTTGCAAGTCTTTGCTACT 
GTGAATAGTGCCGCAATAAACATACGTGTGCATGTGTCTTTATAGCAGCATGATTTATAATCCTTTGGGTATATACCCAG 
TAATGGGATGGCTGGGTCAAATGGTATTTCTAGTTCTAGATCCTTGAGGAATCACCACACTGTCTTCCACAATGGTTGAA 
CTAGTTTACACTCCCACCAACAGTGTAAAAGTGTTCTGGTGCTGGAGAGGATGTGGACAGCAGTTATTTTTTTATGAAAA 
TAGTATCACTGAACAAGCAGACAGTTAGTGAAGGATGCGTCAGGAAGCCTGCAGGCCACACAGCCATTTCTCTCGAAGAC 
TCCGGGTTTTTCCTGTGCATCTTTTGAAACTCTAGCTCCAATTATAGCATGTACAGTGGATCAAGGTTCTTCTTCATTAA 
GGTTCAAGTTCTAGATTGAAATAAGTTTATGTAACAGAAACAAAAATTTCTTGTACACACAACTTGCTCTGGGATTTGGA 
GGAAAGTGTCCTCGAGCTGGCGGCACACTGGTCAGCCCTCTGGGACAGGATACCTCTGGCCCATGGTCATGGGGCGCTGG 
GCTTGGGCCTGAGGGTCACACAGTGCACCATGCCCAGCTTCCTGTGGATAGGATCTGGGTCTCGGATCATGCTGAGGACC 
ACAGCTGCCATGCTGGTAAAGGGCACCACGTGGCTCAGAGGGGGCGAGGTTCCCAGCCCCAGCTTTCTTACCGTCTTCAG 
TTATTTTTCCCTAAGAGTCTGAGAAGTGGGGCCGCGCCTGATGGCCTTCGTTCGTCTTCAGCTGGCACAGAATTGCACAA 
GCTGATGGTAAACACTGAGTACTTATAATGAATGAGGAATTGCTGTAGCAGTTAACTGTAGAGAGCTCGTCTGTTGGAAA 
GAAATTTAAGTTTTTCATTTAACCGCTTTGGAGAATGTTACTTTATTTATGGCTGTGTAAATTGTTTGACATTCAGTCCC 
TCGTAGACAGATACTACGTAAAAAGTGTAAAGTTAACCTTGCTGTGTATTTTCCCTTATTTTAG 

Intron 10 (SEQ ID NO 14) 

GTGAGGCCCGTGCCGTGTGTCTGTGGGGACCTCCACAGCCTGTGGGCTTTGCAGTTGAGCCCCCCGTGTCCTGCCCCTGG 
CACCGCAGCGTTGTCTCTGCCAAGTCCTCTCTCTCTGCCGGTGCTGGATCCGCAAGAGCAGAGGCGCTTGGCCGTGCACC 
CAGGCCTGGGGGCGCAGGGGCACCTTCGGGAGGGAGTGGGTACCGTGCAGGCCCTGGTCCTGCAGAGACGCACCCAGGTT 
ACACACGTGGTGAGTGCAGGCGGTGACCTGGCTCCTGCTGCTCTTTGGAAAGTCAAGAGTGGCGGCTCCTGGGGCCCCAG 
TGAGACCCCCAGGAGCTGTGCACAGGGCCTGCAGGGCCGAGGCGGCAGCCTCCTCCCCAGGGTGCACCTGAGCCTGCGGA 
GAGCAGGAGCTGCTGAGTGAGCTGGCCCACAGCGTTCGCTGCGGTCACGTTCCTGCGTGGGGTTGTTTGGGATCGGTGGG 
AGAATTTGGATTTGCTGAGTGCTGCTGTCTTGAACCACGGAGATGGCTAGGAGTGGGTTTCAGAGTTGATTTTTGTGAAT 
CAAACTAAAATCAGGCACAGGGGACCTGGCCTCAGCACAGGGGATTGTCCAATGTGGTCCCCCTCAAGGGCGCCCCACAG 
AGCCGGTGGGCTTGTTTTAAAGTGCGATTTGACGAGGGACGAGAAACCTTGAAAGCTGTAAAGGGAACCCTCAGAAAATG 
TGGCCGCCAGGGGTGGTTTCAGGTGCTTTGCTGGGCTGTGTTTGTGAAAACCCATTTGGACCCGCCCTCCAAGTCCACCC 
TCCAGGTCCACCCTCCAGGGCCGCCCTGGGCTGGGGGTATGCCTGGCGTTCCTTGTGCCGCAGCCCGGAGCACAGCAGGC 
TGTGCACATTTAAATCCACTAAGATTCACTCGGGGGGAGCCCAGGTCCCAAGCAACTGAGGGCTCAGGAGTCCTGAGGCT 
GCTGAGGGGACAGAGCAGACGGGGAACGCTGCTTCTGTGTGGCAAGTTCCTGAGGGTGCTGGCCAGGGAGGTGGCTCAGA 
GTGTATGTTGGGGTCCCACCGGGGGCAGAACTCTGTCTCTGATGAGTCGGCAGCCATGTAACAGGAAGGGGTGGCCACAG 
GGAGCTGGGAATGCACCAGGGGAGCTGCGCAGCTGGCCGAGGTCCCAGGGCCAGGCCACAGGAAGGGCAGGGGGACGCCC 
GGGGCCACAGCAGAGGCCGCAGGAAGGGAAGGGGATGCCCAGGCCAGAGCAGAGGCTACCGGGCACAGGGGGGCTCCCTG 
AGCTGGGTGAGCGAGGCTCATGACTCGGCGAGGGAACCTCCTTGACGTGAAGCTGACGACTGGTGTTGCCCAGCTCACAG 
CCCAGCCAGGTCCCGCGCCTGAGCAGGAACTCAGAACCCTCCCCTTTGTCTAAAGCACAGCAGATGCCTTCAGGGCATCT 
AGGAGAAAACAGGCAAAGTCGTTGAGAAACGTCTTAAAAGAAGGTGGGATGGTGGCAATTTCTTGTCCAGATTTTAGTCT 
GCCCCGGACCACAGATGAGTCTATAACGGGATTGTGGTGTTGCCATGGGGACACATGAGATGGACCATCACAGAGGCCAC 
TGGGGCTGCACCTCCCATCTGAGTCCTGGCTGTCCCGGGTCCAGGCCAGGTTCTTGCATGCTCACCTACCTGTCCTGCCC 
GGGAGACAGGGAAAGCACCCCGAAGTCTGGAGCAGGGCTGGGTCCAGGCTCCTCAGAGCTCCTGCCAGGCCCAGCACCCT 
GCTCCAAATCACCACTTCTCTGGGGTTTTCCAAAGCATTTAACAAGGGTGTCAGGTTACCTCCTGGGTGACGGCCCCGCA 
TCCTGGGGCTGACATTGCCCCTCTGCCTTAG 

Intron 11 (SEQ ID NO 15) 

GTGAGCGCACCTGGCCGGAAGTGGAGCCTGTGCCCGGCTGGGGCAGGTGCTGCTGCAGGGCCGTTGCGTCCACCTCTGCT 
TCCGTGTGGGGCAGGCGACTGCCAATCCCAAAGGGTCAGAGGCCACAGGGTGCCCCTCGTCCCATCTGGGGCTGAGCAGA 
AATGCATCTTTCTGTGGGAGTGAGGGTGCTCACAACGGGAGCAGTTTTCTGTGCTATTTTGGTAAAAGGAAATGGTGCAC 
CAGACCTGGGTGCACTGAGGTGTCTTCAGAAAGCAGTCTGGATCCGAACCCAAGACGCCCGGGCCCTGCTGGGCGTGAGT 
CTCTCAAACCCGAACACAGGGGCCCTGCTGGGCATGAGTCCCTCTGAACCCGAGACCCTGGGGCCCTGCTGGGCGTGAGT 
CTCTCCGAACCCAGAGACTTCAGGGCCCTTTTGGGCGTGAGTCTCTCCGCTGTGAGCCCCACACTCCAAGGCTCATCCAC 
AGTCTACAGGATGCCATGAGTTCATGATCACGTGTGACCCATCAGGGGACAGGGCCATGGTGTGGGGGGGGTCTCTACAA 
AATTCTGGGGTCTTGTTTCCCCAGAGCCCGAGAGCTCAAGGCCCCGTCTCAGGCTCAGACACAAATGAATTGAAGATGGA 
CACAGATGCAGAAATCTGTGCTGTTTCTTTTATGAATAAAAAGTATCAACATTCCAGGCAGGGCAAGGTGGCTCACACCT 
ATAATCCCAGCACTTTGGGAGGCCGAGGTGGGTGGATCACTTGAGGCCAGGAGTTTGAGGCCAACCTAACCAACATAGTG 
AAATTCCATTTCTACTTAAAAAATACAAAAATTAGCCTGGCCTGGTGGCACACGCCTGTAGTCCCCGCTATGCGGGAGGC 
TGAGGCAGGAGAATCATTTGAACCCAGGAGGCAGAGGTTGCAGTGAGCCGAGATCACACCACTGCACTCCAGCCTGGGCA 
ACAGAGTGAGACTTCATCTTAAAAAAAAAAAAAAAAGTATCAGCATTCCAAAACCATAGTGGACAGGTGTTTTTTTATTC 
TGTCCTTCGATAATATTTACTGGTGCTGTGCTAGAGGCCGGAACTGGGGGTGCCTTCCTCTGAAAGGCACACCTTCATGG 
GAAGAGAAATAAGTGGTGAATGGTTGTTAAACCAGAGGTTTAAACTGGGGTCCTGTCGTTCTGAGTTAACAGTCCAGATC 
TGGACTTTGCCTCTTTCCAGAATGCTCCCTGGGGTTTGCTTCATGGGGGAGCAGCAGGTGTGGACACCCTCGTGATGGGG 
GAGCAGCAGGTGCAGACGCCCTCATGATGGGGGAGTGGCAGGTGCAGACACCCTTGTGCATGGTGCCCAGCATGTCCCTG 
TTGCAGCTCCCTCCCCACAAGGATGCCGGTCTCCTGTGCTCCCCACAGTCCCTGCTTCCCTCTCACAGCCTTACCTGGTC 
CTGGCCTCCACTGGCTTTGTCTGCATGATTTCCACATTTCCTGGGCTCCCAGCACCTCTTCGCCTCTCCCAGGCACCTCT 
GCAGTGCTGGCCATACCAGTCAGCTGTGAACTGTCCACTGCTTATTTTGCTCCCCATGAAATGTATTTTTTAGGACAGGC 
ACCCCTGGTTCCAGCCTCTGGCACAGCATCAGTGAATGTTATTGAAGGACAAAGGACAGACAAACAAATCAGGAAAATGG 
GTTCTCTCTAAACACATTGCAAAGCCACAGAGGCTAGTGCAGGATGGGTGGGCATCAGGTCATCAGATGTGGGTCCAATG 
CCAGAATATTCTGTGCTCCCAAAGGCCACTTGGTCAGAGTGTGTGCTTGCAGAGGTGGCTCTAAAAGCTCAGCAGTGGAG 
GCAGTGGTTCGCCATACTCAGGGTGAACTCACATCCTCTGTGTCTGAAGTATACAGCAGAGGCTTGAAGGGCATCTGGGA 
GAAGAAAACAGGCAAAATGATTAAGAAAAGTGAAAAAGGAAAAGTGGTAAGATGGGAATTTTCTTGTCCAGATTTTAGTC 
TCCCAAACCACAGCTCAGATGGTAGAATGTGGTCAGAACTGATGGACAGAACAATAGAACAAAACGGAAGCCCTATCTCT 
CAGAAACGTGTGTTAATGTGGTATGTGGCACAGCTGATGGAAAAGAGAGTGTGTGTGTAATTTTTTTTTCTGAGAAAACT 
GACTGGAAGCAAATAAGTTGTGTCTTTACAGCATATACCAGAGCAGATTCTAGGTAGAAGAGGAGACACATGCAAACAAC 
ACCAGCAACAGAAATAAAACAAAAGACTCAAAGGGAAGGGAGGTGAACGTTCCCTGGTTTGGTGTTGGGGAAGGACACAC 
AGGGAGGCGGATGAAACCAGTGAGGCAACGGGCATTGCTTTCACTGCAGAGAAACTCAGCTTGCCTGAGCCACAGTGAAA 
ATGGCCATTCCCTGGAGCGTTTGTGCACGTGATTTATTTAAGGCGCCCTGTGAGGTCCTGCACATTCATCCTCTCACTTT 
GTTCTCCTAACCACCTGAGAGGTAGAGGAGGAAAGGCTCCAGGGGAGCAGCCGCCCTTGGTCACCCAGCTGGCAAAGGGC 
ATGCATGATTGCAGCCTGGCCTCCTGCTCCGGGGCCCTTGCTCTGCCCGAGGACCCCACACAAGTCAGACCCATAGGCTC 
AGGGTGAGCCGGAGCCCAAGGTCGTGTTGGGGATGGCTGTGAAAGAAGAAATGGACGTCTGATGCACACTTGGGAAGGTC 
CTACCAGCAGCGTCAAAGAAATGCATGTGAAACTGACAGCGAGACCCATCCCTCAAAGAAACGCACGTGAAACTGATGGC 
GAGACCTGTCCCCATCCCTCATGCTGGCTCCTTTTCTGGGCTTGCCAAGAGCCAGCATCAGGTTGAGGCAAGCTGGAAAG 
ACTTTTCTGGAAAGCAGCTTGTTTGCATGGAAGTCCTCACAATGTCCTGTGTCTTCCCAGTAATTCCACTTCTGAAGTGA 
CCAGACATTATCACGGGTCTTATTTACCATTTCCAGTGTTCCAGGCAGGGGGACTTGCCACAGCAAGTCACGAACCTGCC 
CAJv^TACAGGGCTAAGGAGATATTATGCATCACAAAACTTGCTCTGCCATTAAACATTTTTCAAAGAATTTTTGAAGAAT 
GTTTAATGGCACAAAACGTTTATTTCAATGTAGCAGTGTTCAAAGCTGGATGTAAAAGAACACACCCCAGGAGCCTGCCG 
TG.aJ\TG7CATGTGTGTTCATCTTTGGACATGGACATACATGGGCAGTGAGTGGTGGTGAGGCCCTGGAGGACATCGGTGG 
GATGCCTCCATCCTGCCCCTCTGGAGACACCATGTGTGCCACGTGCACTCACTGGAGCCCTGTTTAGCTGGTGCCACCTG 
GCTCTTCCATCCCTGAGATTCAAACACAGTGAGATTCCCCACGCCCAACTCAGTGTTCTCCCACAAAAAACCTGAGTCAC 
ACCTGTGTTCACTCGAGGGACGCCCGGGAGCCAGGGCTCCACAGTTTATTATGTGTTTTTGGCTGAGTTATGTGCAGATC 
TCATCAGGGCAGATGATGAGTGCACAAACACGGCCGTGCGAGGTTTGGATACACTCAACATCACTAGCCAGGTCCTGGTG 
GAGTTTGGTCATGCAGAGTCTGGATGGCATGTAGCATTTGGAGTCCATGGAGTGAGCACCCAGCCCCCTCGGGCTGCAGC 
GCATGCCCCAGGCAGGACAAGGAAGCGGGAGGAAGGCAGGAGGCTCTTTGGAGCAAGCTTTGCAGGAGGGGGCTGGGTGT 
GGGGCAGGCACCTGTGTCTGACATTCCCCCCTGTGTCTCAG 

Intron 12 (SEQ ID NO 16) 

GTGAGCAGGCTGATGGTCAGCACAGAGTTCAGAGTTCAGGAGGTGTGTGCGCAAGTATGTGTGTGTGTGTGTGCGCGCGT 
GCCTGCAAGGCTGATGGTGACTGGCTGCACGTAAGAGTGCACATGTACGCATATACACGTGAGCACATACATGTGTGCAT 
GTGTGTACATGAAGGCATGGCAGTGTGTGCACAGGTGTGCAAGGGCACAAGTGTGTGCACATGCGAATGCACACCTGACA 
TGCATGTGTGTTCGTGCACAGTCGTGTGGGCATTCACGTGAGGTGCATGCGTGTGGGTGTGCAGTGTGAGTAGCATGTGT 
GCACATAACATGTATTGAGGGGTCCTCGTGTTCACCCCGCTAGGTCCTCAGCACCAGTGCCACTCCTTACAGGATGAGAC 
GGGGTCCCAGGCCTTGGTGGGCTGAGGCTCTGAAGCTGCAGCCCTGAGGGCATTGTCCCATCTGGGCATCCGCGTCCACT 
CCCTCTCCTGTGGGCTTCTGTGTCCACTCCCCCTCTCCTGTGGGCATTTACATCCACTCCACTCCCTCTCTCCTGTGGGC 
ATCCGCGTCCACTCCCCCTCTCTGTGGGCATCTGCGTCCACCTCCCCTCTCTGTGGGCATTTGCGTCCACTCCCTCTCCT 
GGTTCCTTCCTGTCTTGGCCGAGCCTCGGGGGCAGGCAGATGACACAGAGTCTTGACTCGCCCAGGGTGGTTCGCAGCTG 
CCGGGTGAGGGCCAGGCCGGATTTCACTGGGAAGAGGGATAGTTTCTTGTCAAAATGTTCCTCTTTCTTGTTCCATCTGA 
ATGGATGATAAAGCAAAAAGTAAAAACTTAAAATCCCAGAGAGGTTTCTACCGTTTCTCACTCTTTCTTGGCGACTCTAG 

Intron 13 (SEQ ID NO 17) 

GTGAGCCGCCACCAAGGGGTGCAGGCCCAGCCTCCAGGGACCCTCCGCGCTCTGCTCACCTCTGACCCGGGGCTTCACCT 
TGGAACTCCTGGGTTTTAGGGGCAAGGAATGTCTTACGTTTTCAGTGGTGCTGCTGCCTGTGCACAGTTCTGTTCGCGTG 
GCTCTGTGCAAAGCACCTGTTCTCCATCTCTGGGTAGTGGTAGGAGCCGGTGTGGCCCCAGGTGTCCCCACTGTGCCTGT 
GCACTGGCCGTGGGACGTCATGGAGGCCATCCCAGGGCAGCAGGGGCATGGGGTAAAGAGATGTTTATGGGGAGTCTTAG 
CAGAGGAGGCTGGGAAGGTGTCTGAACAGTAGATGGGAGATCAGATGCCCGGAGGATTTGGGGTCTCAGCAAAGAGGGCC 
GAGGTGGGTGCAGGTGAGGGTCGCTGGCCCCACCCCCGGGAAGGTGCAGCAGAGCTGTGGCTCCCCACACAGCCCGGCCA 
GCACCTGTGCTCTGGGCATGGCTGTGCTCCTGGAACGTTCCCTGTCCTGGCTGGTCAGGGGGTGCCCCTGCCAAGAATCG 
ACAACTTTATCACAGAGGGAAGGGCCAATCTGTGGAGGCCACAGGGCCAGCTTCTGCCTGGAGTCAGGGCAGGTGGTGGC 
ACAAGCCTCGGGGCTGTACCAAAGGGCAGTCGGGCACCACAGGCCCGGGCCTCCACCTCAACAGGCCTCCCGAGCCACTG 
GGAGCTGAATGCCAGGAGGCCGAAGCCCTCGCCCCATGAGGGCTGAGAAGGAGTGTGAGCATTTGTGTTACCCAGGGCCG 
AGGCTGCGCGAATTACCGTGCACACTTGATGTGAAATGAGGTCGTCGTCTATCGTGGAAACCCAGCAAGGGCTCACGGGA 
GAGTTTTCCATTACAAGGTCGTACCATGAAAATGGTTTTTAACCCGAGTGCTTGCGCCTTCATGCTCTGGCAGGGAGGGC 
AGAGCCACAGCTGCATGTTACCGCCTTTGCACCAGCTCCAGAGGCTTGGGACCAGGCTGTCTCAGTTCCAGGGTGCGTCC 
GGCTCAGACCGCCCTCCTCTCTGCCTTCTCTCTCTGCCTCAAATCTTCCCTCGTTTGCATCTCCCTGACGCGTGCCTGGG 
CCCTCGTGCAAGCTGCTTGACTCCTTTCCGGAAACCCTTGGGGTGTGCTGGATACAGGTGCCACTGAGGACTGGAGGTGT 
CTGACACTGTGGTTGACCCCAGGGTCCAGCTGGCGTGCTTGGGGCCTCCTTGGGCCATGATGAGGTCAGAGGAGTTTTCC 
CAGGTGAAAACTCCTGGGAAACTCCCAGGGCCATGTGACCTGCCACCTGCTCCTCCCATATTCAGCTCAGTCTTGTCCTC 
ATTTCCCCACCAGGGTCTCTAGCTCCGAGGAGCTCCCGTAGAGGGCCTGGGCTCAGGGCAGGGCGGCTGAGTTTCCCCAC 
CCATGTGGGGACCCTTGGGTAGTCGCTTGATTGGGTAGCCCTGAGGAGGCCGAGATGCGATGGGCCACGGGCCGTTTCCA 
AACACAGAGT JAGGCACGTGGAAGGCCCAGGAATCCCCTTCCCTCGAGGCAGGAGTGGGAGAACGGAGAGCTGGGCCCCG 
ATTTCACGGCAGCCAGGCTGCAGTGGGCGAGGCTGTGGTGGTCCACGTGGCGCTGGGGGCGGGGTCTGATTCAAATCCGC 
TGGGGCTCGG JCTTCCTGGCCCGTGCTGGCCGCGCCTCCACACGGGCTTGGGGTGGACGCCCCGACCTCTAGCAGGTGGC 
TATTTCTCCCTTTGGAAGAGAGCCCCTCACCCATGCTAGGTGTTTCCCTCCTGGGTCAGGAGCGTGGCCGTGTGGCAACC 
CCGGGACCTTAGGCTTATTTATTTGTTTAAAAACATTCTGGGCCTGGCTTCCGTTGTTGCTAAATGGGGAAAAGACATCC 
CACCTCAGCAGAGTTACTGAGAGGCTGAAACCGGGGTGCTGGCTTGACTGGTGTGATCTCAGGTCATTCCAGAAGTGGCT 
CAGGAAGTCAGTGAGACCAGGTACATGGGGGGCTCAGGCAGTGGGTGAGATGAGGTACACGGGGGGCTCAGGCAGTGGGT 
GAGGCCAGGTACATGGGGGGCTCAGGCACTGGGTGAGATGAGGTACACGGGGGGCTCAGGCAGAGGGTCAGACCAGGTAC 
ACGGGGGCTCTGATCACACGCACATATGAGCACATGTGCACATGTGCTGTTTCATGGTAGCCAGGTCTGTGCACACCTGC 
CCCAAAGTCCCAGGAAGCTGAGAGGCCAAAGATGGAGGCTGACAGGGCTGGCGCGGTGGCTCACACCTGTAGTCCCAGCA 
CTTTGGGAGGCCGAGGCGAGAGGATCCCTTGAGCCCAGGAGTTTAAGACCAGCCTGAGCAACATAGTAGAACCCCATCTC 
TATGAAAAATAAAAACAAAAATTAGCTGAACATGGTGGTGTGCGCCTGTAGTTCCAATACTTGGGAGGCTGAAGTGGGAG 
GATCACTTGAGCCCAGGAGGTGGAAGCTGCAGTGAGCTGAGATTGCACCACTGTACTGCAGCCTGGGTGACAGAGTGAGA 
GCCCATCTCAACAACAACAAAGAAGACTGACAAATGCAGTTTCTTGGAAAGAAACATTTAGTAGGAACTTAACCTACACA 
CAGAAGCCAAGTCGGTGTCTCGGTGTCAGTGAGATGAGATGATGGGTCCTCACACCATCACCCCAGACCCAGGGTTTATG 
CACCACAGGGGCGGGTGGCTCAGAAGGGATGCGCAGGACGTTGATATACGATGACATCAAGGTTGTCTGACGAAGGGCAG 
GATTCATGATAAGTACCTGCTGGTACACAAGGAACAATGGATAAACTGGAAACCTTAGAGGCCTTCCCGGAACAGGGGCT 
AATCAGAAGCCAGCATGGGGGGCTGGCATCCAGGATGGAGCTGCTTCAGCCTCCACATGCGTGTTCATACAGATGGTGCA 
CAGAAACGCAGTGTACCTGTGCACACACAGACACGCAGCTACTCGCACACACAAGCACACACACAGACATGCATGCATGC 
ATCCGTGTGTGTGCACCTGTGCCCATGAGGAAACCCATGCATGTGCATTCATGCACGCACACAGGCACCGGTGGGCCCAT 
GCCCACACCCACGAGCACCGTCTGATTAGGAGGCCTTTCCTCTGACGCTGTCCGCCATCCTCTCAG 

Intron 14 (SEQ ID NO 18) 

GTATGTGCAGGTGCCTGGCCTCAGTGGCAGCAGTGCCTGCCTGCTGGTGTTAGTGTGTCAGGAGACTGAGTGAATCTGGG 
CTTAGGAAGTTCTTACCCCTTTTCGCATCAGGAAGTGGTTTAACCCAACCACTGTCAGGCTCGTCTGCCCGCCCTCTCGT 
GGGGTGAGCAGAGCACCTGATGGAAGGGACAGGAGCTGTCTGGGAGCTGCCATCCTTCCCACCTTGCTCTGCCTGGGGAA 
GCGCTGGGGGGCCTGGTCTCTCCTGTTTGCCCCATGGTGGGATTTGGGGGGCCTGGCCTCTCCTGTTTGCCCTGTGGTGG 
GATTGGGCTGTCTCCCGTCCATGGCACTTAGGGCCCTTGTGCAAACCCAGGCCAAGGGCTTAGGAGGAGGCCAGGCCCAG 
GCTACCCCACCCCTCTCAGGAGCAGAGGCCGCGTATCACCACGACAGAGCCCCGCGCCGTCCTCTGCTTCCCAGTCACCG 
TCCTCTGCCCCTGGACACTTTGTCCAGCATCAGGGAGGTTTCTGATCCGTCTGAAATTCAAGCCATGTCGAACCTGCGGT 
CCTGAGCTTAACAGCTTCTACTTTCTGTTCTTTCTGTGTTGTGGAAATTTCACCTGGAGAAGCCGAAGAAAACATTTCTG 
TCGTGACTCCTGCGGTGCTTGGGTCGGGACAGCCAGAGATGGAGCCACCCCGCAGACCGTCGGGTGTGGGCAGCTTTCCG 
GTGTCTCCTGGGAGGGGAGCTGGGCTGGGCCTGTGACTCCTCAGCCTCTGTTTTCCCCCAG 

Intron 15 (SEQ ID NO 19) 

GCAAGTGTGGGTGGAGGCCAGTGCGGGCCCCACCTGCCCAGGGGTCATCCTTGAACGCCCTGTGTGGGGCGAGCAGCCTC 
AGATGCTGCTGAAGTGCAGACGCCCCCGGGCCTGACCCTGGGGGCCTGGAGCCACGCTGGCAGCCCTATGTGATTAAACG 
CTGGTGTCCCCAGGCCACGGAGCCTGGCAGGGTCCCCAACTTCTTGAACCCCTGCTTCCCATCTCAGGGGCGATGGCTCC 
CCACGCTTGGGAGCCTTCTGACCCCTGACCTGTGTCCTCTCACAGCCTCTTCCCTGGCTGCTGCCCTGAGCTCCTGGGGT 
CCTGAGCAAGTTCTCTCCCCGCCCCGCCGCTCCAGCGTCACTGGGCTGCCTGTCTGCTCGCCCCGGTGGAGGGGTGTCTG 
TCCCTTCACTGAGGTTCCCACCAGCCAGGGCCACGAGGTGCAGGCCCTGCCTGCCCGGCCACCCACACGTCCTAGGAGGG 
TTGGAGGATGCCACCTCTGGCCTCTTCTGGAACGGAGTCTGATTTTGGCCCCGCAG 

3'-untranscribed region (SEQ ID NO 20) 

ATCTCATGTTTGAATCCTAATGTGCACTGCATAGACACCACTGTATGCAATTACAGAAGCCTGTGAGTGAACGGGGTGGT 
GGTCAGTGCGGGCCCATGGCCTGGCTGTGCATTTACGGAAGTCTATGAGTGAATGGGGTTGTGGTCAGTGCGGGCCCATG 
GCCTGGCTGGGCCTGGGAGGTTTCTGATGCTGTGAGGCAGGAGGGGAAGGAGGGTAGGGGATAGACAGTGGGAGCCCCCA 
CCCTGGAAGACATAACAGTAAGTCCAGGCCCGAAGGGCAGCAGGGATGCTGGGGGCCCAGCTTGGGCGGCGGGGATGATG 
GAGGGCCTGGCCAGGGTGGCAGGGATGATGGGGGCCCCAGCTGGGGTGGCAGGGGTGATGGGGGGGGCTGGTCTGGGTGG 
CGGGGAAGATGGGGAAGCCTGGCTGGGCCCCCTCCTCCCCTGCCTCCCACCTGCAGCCGTGGATCCGGATGTGCTTCCCT 
GGTGCACATCCTCTGGGCCATCAGCTTTCATGGAGGTGGGGGGCAGGGGCATGACACCATCCTGTATAAAATCCAGGATT 
CCTCCTCCTGAACGCCCCAACTCAGGTTGAAAGTCACATTCCGCCTCTGGCCATTCTCTT7VAGAGTAGACCAGGATTCTG 
ATCTCTGAAGGGTGGGTAGGGTGGGGCAGTGGAGGGTGTGGACACAGGAGGCTTCAGGGTGGGGCTGGTGATGCTCTCTC 
ATCCTCTTATCATCTCCCAGTCTCATCTCTCATCCTCTTATCATCTCCCAGTCTCATCTGTCTTCCTCTTATCTCCCAGT 
CTCATCTGTCATCCTCTTACCATCTCCCAGTCTCATCTCTTATCCTCTTATCTCCTAGTCTCATCCAGACTTACCTCCCA 
GGGCGGGTGCCAGGCTCGCAGTGGAGCTGGACATACGTCCTTCCTCAGGCAGAAGGAACTGGAAGGATTGCAGAGAACAG 
GAGGGGCGGCTCAGAGGGACGCAGTCTTGGGGTGAAGAAACAGCCCCTCCTCAGAAGTTGGCTTGGGCCACACGAAACCG 
AGGGCCCTGCGTGAGTGGCTCCAGAGCCTTCCAGCAGGTCCCTGGTGGGGCCTTATGGTATGGCCGGGTCCTACTGAGTG 
CACCTTGGACAGGGCTTCTGGTTTGAGTGCAGCCCGGACGTGCCTGGTGTCGGGGTGGGGGCTTATGGCCACTGGATATG 
GCGTCATTTATTGCTGCTGCTTCAGAGAATGTCTGAGTGACCGAGCCTAATGTGTATGGTGGGCCCAAGTCCACAGACTG 
TGTCGTAAATGCACTCTGGTGCCTGGAGCCCCCGTATAGGAGCTGTGAGGAAGGAGGGGCTCTTGGCAGCCGGCCTGGGG 
GCGCCTTTGCCCTGCAAACTGGAAGGGAGCGGCCCCGGGCGCCGTGGGCGGACGACCTCAAGTGAGAGGTTGGACAGAAC 
AGGGCGGGGACTTCCCAGGAGCAGAGGCCGCTGCTCAGGCACACCTGGGTTTGAATCACAGACCAACAGGTCAGGCCATT 
GTTCAGCTATCCATCTTCTACAAAGCTCCAGATTCCTGTTTCTCCGGGTGTTTTTTGTTGAAATTTTACTCAGGATTACT 
TATATTTTTTGCTAAAGTATTAGACCCTTAAAAAAGGTATTTGCTTTGATATGGCTTAACTCACTAAGCACCTACTTTAT 
TTGTCTGTTTTTATTTATTATTATTATTATTATTAGAGATGGTGTCTACTCTGTCACCCAGGTTGTTAGTGCAGTGGCAC 
AGTCATGGCTCGCTGTAGCCGCAAACCCCCAGGCTCAAGTGATCCTCCGGCCTCAGCTTCCCAGAGTGCTGGGATTACAG 
GTGTGAGCCACTGCCCTTGCCTGGCACTTTTAAAAACCACTATGTAAGGTCAGGTCCAGTGGCTTCCACACCTGTCATCC 
CAGTAGTTTGGGAAGCCGAGGCAGAAGGATTGTCTGAGGCCAGGAGTTTGAGACCAGCATGGGTAACATAGGGAGACCCC 
ATCTCTACAAAAAATGCAAAAAGTTATCCGGGCGTGGGGTCCAGCATCTGTAGTCCCAGCTGCTCGGGAGGCTGAGTGGG 
AGGATCGCTTGAGCCCGGGAGGTCATGGCTGCAGTGAGCTGTGATTGTACCATCGCACTCCAGCCTGGGCAACAGAGTGA 
GACCCTGTCTCAAAAAAAAAAAAAAAAAAAGAAGGAGAAGGAGAAGAGAAGAAGAAGGAAGAAGGAAAGAGAAGAAGAAG 
GAAGAAGGAAGAAAGAAGGAGAAGGAGGCCTGCTAGGTGCTAGGTAGACTGTC7VAATCTCAGAGCAAAATGAAAATAACA 
AAGTTTTAAAGGGAAAGAAAAACCCCAGCTCTTTGGACTTCCTTAGGCCTGAACTTCATCTCAAGCAGCTTCCTTCCACA 
GAC^GCGTGTATGGAGCGAGTGAGTTCAAAGCAGAAAGGGAGGAGAAGCAGGCAAGGGTGGAGGCTGTGGGTGACACCA 
GCCAGGACCCCTGAAAGGGAGTGGTTGTTTTCCTGCCTCAGCCCCACGCTCCTGCCGGTCCTGCACCTGCTGTAACCGTC 
GATGTTGGTGCCAGGTGCCCACCTGGGAAGGATGCTGTGCAGGGGGCTTGCCAAACTTTGGTGGGTTTCAGAAGCCCCAG 
GCACTTGTGGCAGGCACAATTACAGCCCCTCCCCAAAGATGCCCACGTCCTTCTCCTGGAACCTGTGAATGTGTCACCCG 
CAAGGCAGAGGCTGGTGAAGGCTGCAGGTGGAATCACGGCTGCCAGTCAGCCGATCTTAAGGTCATCCTGGATTATCTGG 
TGGGCCTGATATGGCCACAAGGGTCCCTAGAAGTGAGAGAGGGAGGCAGGGGAGAGTCAGAGAGGGGACGTGAGAAGGAC 
CACTGGCCACTGCTGGCTTTGAGATGGAGGAGGGGGTCCCCAGCCAAGGAATGGGGGCAGCCGCTCCATGCTGGAAAAGC 
AAGCAATCCTCCCCGGTCCTGAGGGCACACGGCCCTGCCCACGCCTCGATTTCAGGCCAGTGGGACCTGTTTCAGCTTTC 
CGGCCTCCAGAGCTGTAAGATGATGCGTTTGTGTTCAGCCACTAAGCTGCAGTGATTCGTCACAGCAGCAAATGGAATAG 
CAGTACAGGGAAATGAATACAGGGACAGTTCTCAGAGTGACTCTCAGCCCACCCCTGGG 

Characterization of the exons showed, interestingly, that the functionally important 
hTC protein domains which are described in our Patent Application 
PCT/EP/98/03469 are arranged on separate exons. The telomerase-characteristic T 
motif is located on exon 3. The RT (reverse transcriptase) motifs 1-7, which are 
important for the catalytic function of the telomerase, are located on the following 
exons: RT motifs 1 and 2 on exon 4, RT motif 4 on exon 9, RT motif 5 on exon 10, 
and RT motifs 6 and 7 on exon 11. RT motif 3 is shared by exons 5 and 6 (see 
Fig. 8). 

Elucidation of the exon-intron structure of the hTC gene also shows that the four 
deletion or insertion variants of the hTC cDNA which were described in our Patent 
Application PCT/EP/98/03469, as well as three additional hTC insertion variants 
which are described in the literature (Kilian et al., 1997), in all probability represent 
alternative splicing products. As shown in Fig. 8, the splicing variants can be divided 
into two groups: deletion variants and insertion variants. 

The hTC variants in the deletion group lack specific sequence segments. The 36 bp 
in-frame deletion in variant DEL1 in all probability results from using an alternative 
3' splice acceptor sequence in exon 6, resulting in a part of RT motif 3 being lost. In 
variant DEL2, the normal 5' splice donor and 3' splice acceptor sequences of introns 
6, 7 and 8 are not used. Instead, exon 6 is fused directly to exon 9, resulting in a 
shift in the open reading frame and a stop codon appearing in exon 10. Variant DEL3 
is a combination of variants DEL1 and DEL2. 

The insertion variant group is characterized by the insertion of intron sequences 
which lead to premature cessation of translation. In variant INS1, an alternative splice 
site located further 3' is used instead of the 5' splice donor sequence of intron 5, 
which is normally used, resulting in the insertion of the first 38 bp from intron 4 
between exon 4 and exon 5. The insertion, in variant INS2, of a region of the 
intron 11 sequence likewise results from using an alternative 5' splice donor 
sequence in intron 11. Since this variant was only described inadequately in the 
literature (Kilian et al., 1997), it is not possible to determine the precise alternative 5' 
splice donor sequence in this variant. The insertion of intron 14 sequences between 
exon 14 and exon 15 in variant INS3 comes from using an alternative 3' splice 
acceptor sequence, resulting in the 3' part of intron 14 not being spliced. 
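
The reading-frame consequences mentioned for the variants can be checked with simple 
arithmetic: an inserted or deleted segment preserves the reading frame only if its 
length is a multiple of three. The following is a minimal sketch under that 
assumption; the helper name frame_effect is hypothetical, and the only inputs are 
the 36 bp and 38 bp lengths stated in the text. 

```python
def frame_effect(length_bp: int) -> str:
    """Classify an insertion or deletion by its effect on the reading frame."""
    if length_bp <= 0:
        raise ValueError("length must be a positive number of base pairs")
    # A length divisible by 3 removes or adds whole codons only.
    return "in-frame" if length_bp % 3 == 0 else "frameshift"


# Lengths taken from the text: 36 bp deletion (DEL1), 38 bp insertion (INS1).
for name, length in [("DEL1 deletion", 36), ("INS1 insertion", 38)]:
    print(f"{name} ({length} bp): {frame_effect(length)}")
```

The 36 bp deletion is classified as in-frame, in agreement with the description of 
DEL1, whereas a 38 bp insertion shifts the frame, which is consistent with the 
premature cessation of translation described for the insertion variants. 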

The hTC variant INS4 (variant 4), which is described in our Patent Application 
PCT/EP/98/03469, is characterized by exon 15, and the 5' region of exon 16, 
being replaced by the first 600 bp of intron 14. This variant can be attributed to the 
use of an alternative internal 5' splice donor sequence in intron 14 and an alternative 
3' splice acceptor sequence in exon 16, resulting in an altered C terminus. 

The in vivo generation of hTC protein variants which are probably non-functional 
and which could interfere with the function of the complete hTC protein constitutes a 
possible mechanism, in addition to transcription regulation, for controlling hTC 
protein function. The function of the hTC splicing variants is not yet known. 
Although most of these variants presumably encode proteins without reverse 
transcriptase activity, they could nevertheless play a crucial role as 
transdominant-negative telomerase regulators by, for example, competing for 
interaction with important binding partners. 

The search for possible transcription factor binding sites was carried out using the 
"Find Pattern" algorithm from the Genetics Computer Group (Madison, USA) GCG 
Sequence Analysis program package. This resulted in the identification of a variety 
of potential binding sites for transcription factors in the nucleotide sequence of intron 
2, which binding sites are listed in Tab. 2. In addition, an Sp1 binding site was found 
in intron 1 (pos. 43), and a c-Myc binding site was found in the 5'-untranslated region 
(cDNA position 29-34, cf. Fig. 6). 






Example 6 

In order to ascertain the start point(s) of hTC transcription in HL-60 cells, the 5' end 
of the hTC mRNA was determined by means of primer extension analysis. 

2 µg of polyA+ RNA from HL-60 cells were denatured at 65°C for 10 min. 1 µl of 
RNasin (30-40 U/ml) and 0.3-1 pmol of radioactively labelled primer 
(5' GTTAAGTTGTAGCTTACACTGGTTCTC 3'; 2.5-8 x 10^5 cpm) were added for 
primer annealing, and the whole was incubated, at 37°C for 30 min, in a total volume 
of 20 µl. After the addition of 10 µl of 5x reverse transcriptase buffer (from Gibco- 
BRL), 2 µl of 10 mM dNTPs, 2 µl of RNasin (see above), 5 µl of 0.1 M DTT (from 
Gibco-BRL), 2 µl of ThermoScript RT (15 U/µl; from Gibco-BRL) and 9 µl of 
DEPC-treated water, primer extension took place, at 58°C for 1 h, in a total volume 
[lacuna]. The reaction was stopped by adding 4 µl of 0.5 M EDTA, pH 8.0, and the 
RNA was degraded, at 37°C for 30 min, after having added 1 µl of RNaseA 
(10 mg/ml). 2.5 µg of sheared calf thymus DNA and 100 µl of TE were then added, 
and the mixture was extracted once with 150 µl of phenol/chloroform (1:1). The 
DNA was precipitated, at -70°C for 45 min, after adding 15 µl of 3 M Na acetate and 
450 µl of ethanol, and then centrifuged at 14,000 rpm for 15 min. The precipitate was 
washed once with 70% ethanol, dried in air and dissolved in 8 µl of sequencing stop 
solution. After 5 min of denaturation at 80°C, the samples were loaded onto a 6% 
polyacrylamide gel and fractionated electrophoretically (Ausubel et al., 1987) 
(Fig. 5). 

In this connection, a main transcription start site was identified which is located 
1767 bp 5' of the ATG start codon of the hTC cDNA sequence (nucleotide position 
3346 in Fig. 4). In addition to this, the nucleotide sequence around this main 
transcription start (TTA(+1)TTGT) represents an initiator element (Inr) which, in 6 
out of 7 nucleotides, matches the Inr consensus motif (PyPyA(+1)N(T/A)PyPy) 
(Smale, 1997). 
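
The 6-out-of-7 agreement with the initiator consensus can be reproduced position by 
position. The following is a minimal sketch, assuming the consensus is read as 
Py Py A N (T/A) Py Py with Py = C or T and N = any base; the function name 
inr_matches is hypothetical. 

```python
# Allowed bases at each of the seven consensus positions:
# Py Py A(+1) N (T/A) Py Py, with Py = pyrimidine (C or T), N = any base.
CONSENSUS = ["CT", "CT", "A", "ACGT", "AT", "CT", "CT"]


def inr_matches(site: str) -> int:
    """Count the positions of a 7-nt site that agree with the Inr consensus."""
    site = site.upper()
    if len(site) != len(CONSENSUS):
        raise ValueError("site must be exactly 7 nucleotides long")
    return sum(base in allowed for base, allowed in zip(site, CONSENSUS))


# Sequence around the main transcription start, as given in the text.
print(inr_matches("TTATTGT"), "of", len(CONSENSUS), "positions match")
```

The single mismatch is the G at the sixth position, where the consensus calls for a 
pyrimidine. 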

It was not possible to identify any unambiguous TATA box in the immediate vicinity 
of the experimentally identified main transcription start, which means that the hTC 
promoter has probably to be classified in the family of TATA-less promoters (Smale, 
1997). However, a potential TATA box from nucleotide position 1306 to nucleotide 
position 1311 (Fig. 4) was found by means of bioinformatics analysis. The subsidiary 
transcription starts which were additionally observed around the main transcription 
start have also been described in the case of other TATA-less promoters (Geng and 
Johnson, 1993), for example in the strongly regulated promoters of some cell cycle 
genes (Wick et al., 1995). 

Example 7 

In addition to the start point of the hTC transcript which was described in Example 6 
and identified in HL60 cells, a further transcription start region was also identified in 
HL60 cells. With the aid of RT-PCR analyses, the region of the hTC gene 
transcription start in HL60 cells was localized to bp -60 to bp -105. 

The cDNA for this was synthesized using a First Strand cDNA Synthesis kit 
(Clontech), in accordance with the manufacturer's instructions, and employing 0.4 µg 
of HL60 cell polyA RNA (Clontech) and the gene-specific primer GSP13 
(5'-CCTCCAAAGAGGTGGCTTCTTCGGC-3', cDNA position 920-897). In a final 
volume of 50 µl, 10 pmol of dNTP mix were added to 1 µl of cDNA, and a PCR 
reaction was carried out in 1x PCR reaction buffer F (PCR-Optimizer kit from 
InVitrogen) and using one unit of Platinum Taq DNA polymerase (from Gibco/BRL). 
10 pmol of each of the 5' and 3' primers defined below were added as primers. The 
PCR was carried out in 3 steps. A two-minute denaturation at 94°C was followed by 
36 PCR cycles in which the DNA was first of all denatured at 94°C for 45 sec and, 
after that, the primers were annealed, and the DNA chain was extended, at 68°C for 
5 min. The cycles were concluded by a chain extension at 68°C for 10 min. In all, six 
different 5' PCR primers (primer HTRT5B: 
5'-CGCAGCCACTACCGCGAGGTGC-3', cDNA position 105 to 126; primer C5S: 
5'-CTGCGTCCTGCTGCGCACGTGGGAAGC-3', 5'-flanking region -49 to -23; 
primer PRO-TEST1: 5'-CTCGCGGCGCGAGTTTCAGGCAG-3', 5'-flanking 
region -74 to -52; primer PRO-TEST2: 5'-CCAGCCCCTCCCCTTCCTTTCC-3', 
5'-flanking region -112 to -91; primer PRO-TEST4: 
5'-CCAGCTCCGCCTCCTCCGCGC-3', 5'-flanking region -191 to -171; primer 
RP-3A: 5'-CTAGGCCGATTCGACCTCTCTCC-3', 5'-flanking region -427 to 
-405) were combined with the 3' PCR primer C5Rback 
(5'-GTCCCAGGGCACGCACACCAG-3', cDNA position 245 to 225). Genomic 
DNA was also employed for the PCR, as a control, in addition to the Oligo dT- and 
GSP13-primed cDNAs. As Fig. 9 shows, a PCR product was only obtained with the 
primer combinations HTRT5B-C5Rback, C5S-C5Rback and PRO-TEST1-C5Rback, 
indicating that the start point for hTC transcription lies in the region between bp -60 
and bp -105. 

Example 8 

Several extremely GC-rich regions, so-called CpG islands, are located in the isolated 
5'-flanking region, of about 11.2 kb in size, of the hTC gene. One CpG island, having 
a GC content of > 70%, extends from bp -1214 into intron 2. Two further GC-rich 
regions having a GC content of > 60% extend from bp -3872 to bp -3113 and from 
bp -5363 to bp -3941, respectively. The positions of the CpG islands are shown 
graphically in Fig. 11. 
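
GC-rich regions of this kind are usually located by scanning a sequence with a 
sliding window and reporting the windows whose GC fraction exceeds a threshold. The 
following is a minimal sketch of that idea; the window size, step size and function 
names are assumptions chosen for illustration and are not the parameters used to 
produce Fig. 11. 

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_rich_windows(seq: str, window: int = 200, step: int = 50,
                    threshold: float = 0.70):
    """Yield (start, end, gc) for windows whose GC fraction exceeds threshold.

    start and end are 0-based, end-exclusive offsets into seq.
    """
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        gc = gc_content(chunk)
        if gc > threshold:
            yield start, start + len(chunk), gc


# Toy input (not a sequence from this document): an AT-only stretch,
# a GC-only stretch and another AT-only stretch, 200 nt each.
demo = "AT" * 100 + "GC" * 100 + "AT" * 100
for start, end, gc in gc_rich_windows(demo):
    print(start, end, round(gc, 2))
```

Overlapping windows that pass the threshold can then be merged into contiguous 
regions and converted to positions relative to the ATG start codon. 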

The search for possible transcription factor binding sites was carried out using the 
"Find Pattern" algorithm from the Genetics Computer Group (Madison, USA) GCG 
Sequence Analysis program package. This resulted in the identification of a variety 
of potential binding sites in the region up to -900 bp upstream of the translation start 
codon ATG: five Sp1 binding sites, one c-Myc binding site, and one CCAC box 
(Fig. 10). In addition, a CCAAT box and a second c-Myc binding site were found at 
positions -1788 and -3995, respectively, of the 5'-flanking region. 
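
A pattern search of this kind can be approximated with a plain string scan over both 
strands. The following is a minimal sketch and not the GCG "Find Pattern" program 
itself; the motif strings are commonly cited core sequences (GC box for Sp1, E-box for 
c-Myc, CCAAT box) and are assumptions for illustration, not the patterns used by the 
authors. The demonstration sequence is the C5S primer from Example 7. 

```python
import re

# Commonly cited core motifs; assumptions for illustration only.
MOTIFS = {
    "Sp1 (GC box)": "GGGCGG",
    "c-Myc (E-box)": "CACGTG",
    "CCAAT box": "CCAAT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, motifs=MOTIFS):
    """Return sorted (position, strand, name) hits.

    position is the 0-based offset on the given strand; strand "-" means the
    motif is present on the complementary strand.
    """
    seq = seq.upper()
    hits = []
    for name, motif in motifs.items():
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            if strand == "-" and pattern == motif:
                continue  # palindromic motif: do not report it twice
            # Zero-width lookahead so that overlapping occurrences are found.
            for match in re.finditer(f"(?={pattern})", seq):
                hits.append((match.start(), strand, name))
    return sorted(hits)


# C5S primer sequence from Example 7 (5'-flanking region -49 to -23).
print(find_sites("CTGCGTCCTGCTGCGCACGTGGGAAGC"))
```

Hits found this way are only candidates; whether a site is functional has to be 
established experimentally. 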

Example 9 

In order to analyse the activity of the hTC promoter, PCR amplification was used to 
generate four hTC promoter sequence segments of differing length, which segments 
were cloned into the Promega vector pGL2 5' in front of the luciferase reporter gene. 
The 8.5 kb SacI fragment which was subcloned from phage clone P12 was selected 
as the DNA source for the PCR amplification. In a final volume of 50 µl, 10 pmol of 
dNTP mix were added to 35 ng of this DNA, and a PCR reaction was carried out in 
1x PCR reaction buffer (PCR-Optimizer kit from InVitrogen) and using one unit of 
Platinum Taq DNA polymerase (from Gibco/BRL). In each case 20 pmol of the 5' 
and 3' primers which are defined below were added as primers. The PCR was carried 
out in three steps. A two-minute denaturation at 94°C was followed by 30 PCR 
cycles in which the DNA was first of all denatured at 94°C for 45 sec, after which 
the primers were annealed, and the DNA chain was extended, at 68°C for 5 min. The 
cycles were concluded by a chain extension at 68°C for 10 min. The selected 3' PCR 
primer was in each case the primer PK-3A 
(5'-GCAAGCTTGACGCAGCGCTGCCTGAAACTCG-3', position -43 to -65), 
which primer recognizes a sequence region 42 bp upstream of the ATG START 
codon. A promoter fragment of 4051 bp in size (NPK8) was amplified by combining 
the PK-3A primer with the 5' PCR primer PK-5B 
(5'-CCAGATCTCTGGAACACAGAGTGGCAGTTTCC-3', position -4093 to 
-4070). Combining the pair of primers PK-3A and PK-5C 
(5'-CCAGATCTGCATGAAGTGTGTGGGGATTTGCAG-3', position -3120 to 
-3096) led to the amplification of a promoter fragment of 3078 bp in size (NPK15). 
Use of the primer combination PK-3A and PK-5D 
(5'-GGAGATCTGATCTTGGCTTACTGCAGCCTCTG-3', position -2110 to 
-2087) amplified a promoter fragment of 2068 bp in size (NPK22). Finally, using the 
primer combination PK-3A and PK-5E 
(5'-GGAGATCTGTCTGGATTCCTGGGAAGTCCTCA-3', position -1125 to 
-1102) led to the amplification of a promoter fragment of 1083 bp in size (NPK27). 
The PK-3A primer contains a HindIII recognition sequence. The different 5' primers
contain a BglII recognition sequence.
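
The fragment sizes and restriction sites stated above can be cross-checked with a short calculation. The sketch below assumes that each promoter fragment runs from the 5'-most position of the 5' primer to position -43 (the 3'-most template position covered by PK-3A), counted inclusively, and that the added restriction-site tails are not part of the stated size. The dictionary layout and function name are illustrative choices, not part of the described method.

```python
# Recognition sequences of the two restriction enzymes used for cloning.
HINDIII_SITE = "AAGCTT"
BGLII_SITE = "AGATCT"

# 3' primer: (name, sequence 5'->3', promoter position nearest the ATG)
PRIMER_3 = ("PK-3A", "GCAAGCTTGACGCAGCGCTGCCTGAAACTCG", -43)

# 5' primers keyed by construct: (name, sequence 5'->3', 5'-most position)
PRIMERS_5 = {
    "NPK8": ("PK-5B", "CCAGATCTCTGGAACACAGAGTGGCAGTTTCC", -4093),
    "NPK15": ("PK-5C", "CCAGATCTGCATGAAGTGTGTGGGGATTTGCAG", -3120),
    "NPK22": ("PK-5D", "GGAGATCTGATCTTGGCTTACTGCAGCCTCTG", -2110),
    "NPK27": ("PK-5E", "GGAGATCTGTCTGGATTCCTGGGAAGTCCTCA", -1125),
}


def fragment_size(start_5, end_3):
    """Inclusive length of the promoter region between two positions.

    Both positions are negative numbers relative to the ATG start codon.
    """
    return end_3 - start_5 + 1


if __name__ == "__main__":
    _, seq_3, end_3 = PRIMER_3
    print("PK-3A carries a HindIII site:", HINDIII_SITE in seq_3)
    for construct, (name, seq, start_5) in PRIMERS_5.items():
        size = fragment_size(start_5, end_3)
        has_site = BGLII_SITE in seq
        print(f"{construct}: {name} + PK-3A -> {size} bp, BglII site: {has_site}")
```

Under these assumptions the calculation reproduces the sizes given in the text (4051, 3078, 2068 and 1083 bp).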

The resulting PCR products were purified using the Qiagen QIAquick spin PCR
purification kit, in accordance with the manufacturer's instructions, and then digested
with the restriction enzymes BglII and HindIII. The pGL2 promoter vector was
digested with the same restriction enzymes, and the SV40 promoter contained in this
vector was released and removed. The PCR promoter fragments were ligated into the
vector, which was then transformed into competent DH5α bacteria (from
Gibco/BRL). DNA for the promoter activity analyses, which are described below,
was isolated from transformed bacterial clones using the Qiagen plasmid kit.

Example 10 

The activity of the hTC promoter was analysed in transient transfections in
eukaryotic cells.

All the work with eukaryotic cells was carried out at a sterile workstation. CHO-K1
and HEK 293 cells were obtained from the American Type Culture Collection.

CHO-K1 cells were kept in DMEM Nut Mix F-12 cell culture medium (from Gibco-BRL,
order number: 21331-020) containing 0.15% streptomycin/penicillin, 2 mM
glutamine and 10% FCS (from Gibco-BRL).

HEK 293 cells were cultured in DMEM cell culture medium (from Gibco-BRL, order
number: 41965-039) containing 0.15% streptomycin/penicillin, 2 mM glutamine and
10% FCS (from Gibco-BRL).

CHO-K1 and HEK 293 cells were cultured at 37°C in a water-saturated atmosphere
while being gassed with 5% CO2. When the cell lawn was confluent, the medium
was aspirated, after which the cells were washed with PBS (100 mM KH2PO4 pH
7.2; 150 mM NaCl) and released by adding a trypsin-EDTA solution (from Gibco-BRL).
The trypsin was inactivated by adding medium and the cell count was
determined using a Neubauer counting chamber in order to plate out the cells at the
desired density.
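
The arithmetic behind counting with a Neubauer chamber and plating at a desired density can be sketched as follows. The factor of 10^4 comes from the chamber geometry (each large square is 1 mm x 1 mm with a depth of 0.1 mm, i.e. 0.1 µl). The function names and the example counts are hypothetical and only serve to illustrate the calculation.

```python
def cells_per_ml(counts, dilution_factor=1.0):
    """Cell concentration from Neubauer chamber counts.

    `counts` holds the number of cells counted in each large square
    (1 mm x 1 mm x 0.1 mm = 0.1 µl); the mean count is scaled to 1 ml.
    """
    mean_count = sum(counts) / len(counts)
    return mean_count * dilution_factor * 1e4


def volume_for_cells(target_cells, concentration_per_ml):
    """Volume of suspension (in ml) that contains `target_cells` cells."""
    return target_cells / concentration_per_ml


if __name__ == "__main__":
    example_counts = [48, 52, 50, 50]  # hypothetical counts, four large squares
    conc = cells_per_ml(example_counts)
    per_well = volume_for_cells(2e5, conc)  # 2 x 10^5 cells per well
    print(f"{conc:.0f} cells/ml; {per_well * 1000:.0f} µl of suspension per well")
```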

For the transfection, in each case 2 x 10^5 HEK 293 cells were plated out, per well, in a
24-well cell culture plate. The HEK 293 medium was removed after 3 hours. For the
transfection, up to 2.5 µg of plasmid DNA, 1 µg of a CMV β-Gal plasmid construct
(from Stratagene, order number: 200388), 200 µl of serum-free medium and 10 µl of
transfection reagent (DOTAP from Boehringer Mannheim) were incubated at room
temperature for 15 minutes and then dropped uniformly onto the HEK 293 cells. 1.5
ml of medium were added after 3 hours. The medium was changed after 20 hours.
After a further 24 hours, the cells were harvested for determining the luciferase
activity and the β-Gal activity. For this, the cells were lysed, at room temperature for
15 minutes, in the cell culture lysis reagent (25 mM Tris [pH 7.8] containing H3PO4;
2 mM CDTA; 2 mM DTT; 10% glycerol; 1% Triton X-100). Twenty µl of this cell
lysate were mixed with 100 µl of luciferase assay buffer (20 mM Tricine; 1.07 mM
(MgCO3)4Mg(OH)2·5H2O; 2.67 mM MgSO4; 0.1 mM EDTA; 33.3 mM DTT;
270 µM coenzyme A; 470 µM luciferin; 530 µM ATP), and the light generated by
the luciferase was measured.

In order to measure the β-galactosidase activity, equal quantities of cell lysate and
β-galactosidase assay buffer (100 mM sodium phosphate buffer, pH 7.3; 1 mM MgCl2;
50 mM β-mercaptoethanol; 0.665 mg of ONPG/ml) were incubated at 37°C for at
least 30 minutes or until a slight yellow coloration appeared. The reaction was
stopped by adding 100 µl of 1 M Na2CO3, and the absorbance was determined at
420 nm.
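
A co-transfected β-galactosidase plasmid is commonly used to correct luciferase readings for differences in transfection efficiency. The text does not spell out how the relative luciferase activities were calculated, so the following is only a sketch under the assumption that each luciferase reading is divided by the β-galactosidase absorbance of the same lysate and then expressed relative to a promoterless control. All numbers in the example are invented and the function names are hypothetical.

```python
from statistics import mean, stdev


def normalized_activity(luciferase_units, bgal_a420, bgal_blank=0.0):
    """Luciferase light units per unit of beta-galactosidase absorbance."""
    return luciferase_units / (bgal_a420 - bgal_blank)


def fold_over_control(sample_replicates, control_replicates):
    """Mean and standard deviation of the activity relative to a control.

    Each replicate is a (luciferase_units, bgal_a420) pair taken from one
    well; at least two sample replicates are needed for the deviation.
    """
    control = mean(normalized_activity(*r) for r in control_replicates)
    folds = [normalized_activity(*r) / control for r in sample_replicates]
    return mean(folds), stdev(folds)


if __name__ == "__main__":
    # Invented readings for illustration only: (light units, A420)
    promoterless = [(100.0, 0.5), (100.0, 0.5)]
    construct = [(4000.0, 0.5), (8000.0, 0.5)]
    avg, sd = fold_over_control(construct, promoterless)
    print(f"fold over control: {avg:.1f} +/- {sd:.1f}")
```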



In order to analyse the hTC promoter, four hTC promoter sequence segments of
differing length were cloned 5' in front of the luciferase reporter gene (cf. Example
9).
The relative luciferase activities of two independent transfections in HEK 293 cells,
using the constructs NPK8, NPK15, NPK22 and NPK27, are plotted in Fig. 11. Each
experiment was carried out in duplicate. The standard deviation is also shown.
The construct NPK27 exhibits a luciferase activity which is 40 times higher than the
basal activity of the promoterless luciferase control construct (pGL2-Basic) and from
2 to 3 times higher than that of the SV40 promoter control construct (pGL2PRO).
Interestingly, a luciferase activity which was from 2 to 3 times lower than that
obtained with the NPK27 construct was observed in cells which were transfected
with longer hTC promoter constructs (NPK8, NPK15, NPK22). Similar results were
also observed in CHO cells (data not shown).
<110> Bayer AG



<120> Regulatory DNA sequences from the 5' region of the gene
for the human catalytic telomerase subunit, and 
their diagnostic and therapeutic use 

<130> Le A 32 805-Foreign Countries

<140> 
<141> 

<160> 20 

<170> PatentIn Vers. 2.0

<210> 1 

<211> 5126 

<212> DNA 

<213> Homo sapiens 



<400> 1 

gagctctgaa 

gggtaatgaa 

ccgttccttc 

tcgggtgtga 

gagcacgggc 

accaggctgg 

caggcactcc 

gagactcagc 

gcc tcagctt 

tcacc tgtcc 

ggtgtctgtc 

agactggctc 

ggctgcacgc 

ccggtgtgtt 

ctagggtctc 

ggaaatgcaa 

caggcctggg 

ccccctccct 

aagacccagc 

cgcacatcat 

caaagcaggg 

tccgcacggt 

catctcaagg 

gctcaaaaag 

gggggcggca 
atggtattgg 
ggc tgtgcca 
acgtcctgat 
gtgatctccg 
gtaatccagg 
gatggaggca 
gcgcctccag 
gcacggctgg 
agggcac teg 
acccatgcac 
cagggctgaa 
ttattttatt 
ageggcatga 
tcagcctccc 
tt ttagtaga 
gtgatccgcc 
ggee tatt ta 
ca tggagt tc 
ttegtagact 
tcccatggga 
tgccatctgc 
atctcaatgt 
actgggat tg 
tggaggaagg 
tgttggtttg 
agtgcaatgg 
ctgcttccgc 
tttgtatttt 
gaacttctga 



ccgtggaaac 
gtggtgtgca 
catcattatt 
caagecatga 
cacacccctg 
ggtgacaaca 
cccaaattc t 
ctggggtgcc 
ctccagcagc 
cac tgtgtct 
tccttcccca 
c tc tgagect 
tgacctccat 
cttctgtttc 
ggggtt ttta 
catttgggtg 
gatggagece 
ctggaacaca 
attggcaccc 
gtacacactc 
aaatccctgc 
ggacagttcc 
gaattacget 
aaagaatttc 
gctgggggct 
ctcagttatg 
tetttgecat 
tcccccaaac 
tgaggaccct 
ggttctggga 
gtcagtctga 
aagctggaaa 
cccttagccc 
cgctgccct t 
tgtgaatcta 
gtgcctccgg 
tacttacttt 
tct tggctca 
aagtagctgg 
gatgggctt t 
cacc teagee 
accattttaa 
aatttcccc t 
ggggatacac 
cccactgcag 
cagtagaaac 
ctcagtgtgt 
agccccttcc 
aatgatact t 
tttgttttgt 
cgcgatcttg 
ctcccatttg 
tagtagagac 
cctcagatga 



gaacatgacc 
ggaaatggcc 
catcttcacc 
caaaac tcag 
atatattaag 
geggctgaac 
agggcctggt 
acactgaggc 
ttcctaaacc 
tgtctcagcg 
acactcacat 
gaacctggct 
t tccaggcgc 
tgtgctcctt 
taggcatagg 
tgaaagtagg 
ccgccaggga 
gagtggcagt 
ctggacattt 
ccgtccacga 
taaaatgtcc 
tcacagtgaa 
gagtcaaaac 
accccatggc 
actgcacgca 
ggagactaac 
gcccgagtgt 
ctgtggacag 
gaggtctggg 
agaggeggge 
ggctgaaaag 
aagcggggaa 
accagggccc 
ctagcatgaa 
ggattatttc 
gcaagggcag 
ctgagacaga 
ctgcaacc tc 
gat t tcaggc 
caccatgt tg 
tcccaaagtg 
aacttccctg 
ttac tcagga 
egtetc ttga 
gggcagctgg 
ctgatgtaga 
gctgaaacat 
ctatcccccc 
tgttattttt 
tttgagaggc 
get tactgea 
gctgggatta 
gggggtgggt 
tccacctgcc 



cttgcctgcc 
atgtaaatta 
cccaaggact 
tacaaacacc 
agtccaggag 
agtctgttcc 
tgctgcttcc 
cagccctgtc 
ctgggtgggc 
aegtagcteg 
gcgttgaagg 
cgtggccccc 
tccccgtctc 
tccacgtcca 
aegggggegt 
agtgcctgtc 
cccgccct tc 
ttccacaagc 
gccccacagc 
ccgacccccg 
t ttaacaaac 
gaggaacatg 
tgccacctcc 
aggggagtgg 
ccttttacta 
cataggggag 
cctgggcagg 
aacccgcccg 
atectteggg 
aggagggtca 
ggagggaggg 
gggaccctcc 
atcgtggacc 
gtgtgtgggg 
aaaacaaagg 
ggcaggcacg 
gttatgetet 
cgtctcctgg 
gtgcaccacc 
gtcaagctga 
c tgggattac 
ggctcaagtc 
gttaccctcc 
catat tcaca 
gaggctgeag 
ateagggege 
gtagaaatta 
ccaggggcag 
cactgc tggt 
ggtt tcactc 
gcctctgcct 
caggcacccg 
ggggttcacc 
tctgcctcct 



tgcttccctg 
cacgactctg 
gaatgattcc 
actcttttac 
agatgaggc t 
tctagactag 
cgagggcgcc 
tccacaccc t 
cgtgttccag 
cacggttcct 
gaggagat tc 
gatgeaggtt 
ctgtcatctg 
gctgcgtgtg 
ggtgggccag 
ct cacctagg 
tctgcccagc 
actaagcatc 
cctgggaatt 
ctgttttatt 
tggttaaaca 
ccgtttataa 
atgggatacg 
ttaggggggt 
aagccagttt 
tggggatggg 
ataatgetet 
gccccagggc 
actacctgca 
gaggggggca 
cctcgagccc 
aeggagectg 
tccggcctcc 
atttgeagaa 
t t tacagaaa 
agtgatttta 
tgt tgcccag 
gttcaagcaa 
acacccggct 
tctcaaaatc 
aggcatgagc 
acacccactg 
tttgatattt 
gtttctgtga 
get tcaggtc 
aagtgtggac 
aagtccatcc 
aggagttcct 
actgaatcca 
ttgttgctca 
cccaggttca 
ccaccatgcc 
atgttggcca 
aaagtgctgg 



ggtgggtcaa 
ctgatgggga 
agcaacttct 
taggcccaca 
get ttcagee 
tagaccctgg 
ate tgece tg 
ccgcctccag 
cgctactgtc 
cctcacatgg 
tgcgcctccc 
cctggcgtcc 
ccggggcctg 
tctctgcccg 
ggcgctcttg 
tccacgggca 
actttcctgc 
ctcttcccaa 
caegtgacta 
ttaatagcta 
aacgggtcca 
agectgeagg 
tacgeaacat 
taaggacggt 
cctggttctg 
ggaacccgga 
agagatgece 
etttgeaggt 
ggcccgaaaa 
gcctcaggac 
aggectgeaa 
cagcaggaag 
gtgecatagg 
gcaacaggaa 
catccaagga 
tttagctatt 
gctggagtgc 
ttctcgtgcc 
aattttgtat 
c tgacc tcag 
cactgcacct 
gtaaggagt t 
tctgtaattc 
ccacctgtta 
ccagtggggt 
actgtcctga 
ctcctactct 
ctcactcctg 
ctgtttcatt 
ggctggaggg 
agtgattctc 
cagctaattt 
ggctggtctc 
gattacaggt 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 
gtgagccacc
gaagctcacc 
tgttagaaca 
acacac taac 
tgccgggagg 
ttccatttct 
aaccagtgta 
cctagtggca 
ggatttctag 
gagcgtgaca 
aatttcctcc 
atttcagtgt 
tttctcgccc 
cagctgtcct 
tctactgctg 
gcctggaccc 
catctgccag 
ccggtgcgcg 
cccggtgggt 
gagaacctgc 
tgcagggagg 
tcgtccccag 
ccggagcccg 
gcggccaaag 
cgggttaccc 
cctgcaccct 
ccgcccggag 
gacgcccagg 
ccccttcacc 
gggtccccgg 
cgccc tctcc 
ccctggcccc 



atgcccagct 
ccactcaagt 
ctcttgatgt 
tgcacccata 
cgtttcctcg 
tctcttccct 
agctacaact 
gagacaattc 
aagagcgacc 
gcccagggag 
ggcagtttct 
ttgccgacct 
ccttagatcc 
gcggttgtgc 
ggctggaagt 
cgaggctgcc 
acagagtgcc 
gccagcagga 
gat taacaga 
aaagagaaat 
cactccggga 
ccgcgtctac 
acgccccgcg 
ggtcgccgca 
cacagcctag 
gggagcgcga 
cagctgcgct 
accgcgctcc 
ttccagctcc 
cccagccccc 
tcgcggcgcg 
ggccaccccc 



cagaatt tac 
gttgtggtgt 
tttacactgt 
atactggggt 
ccatgcacat 
cttttaaaat 
taacttttgt 
acaaacacag 
tgtaatccta 
ggtgcgaggc 
gaaagtagga 
cagctacagc 
aaacttgagc 
cggggcccca 
cgggcctcct 
ctccaccctg 
ggggcccagg 
gcgcctggct 
tttggggtgg 
gacgggcctg 
ggtcccgcgt 
gcgcctccgt 
tccggacctg 
cgcacctgtt 
gccgattcga 
gcggcgcgcg 
gtcggggcca 
ccacgtggcg 
gcctcctccg 
tccgggccct 
agt ttcaggc 
gcgatg 



tctgtttaga 
tttaagccaa 
gatgactaag 
gtcttctggg 
ggtgttaatt 
tgtgttttct 
tggaacaaat 
ccctttaaaa 
agtatttaca 
ctgttcaaat 
aaggttacat 
atccctgcaa 
aacccggagt 
ggtctggagg 
agctctgcag 
tgcgggcggg 
gtcaaggccg 
ccatttccca 
tttgctcatg 
tgtcaaggag 
gcccgtccag 
cctccccttc 
gaggcagccc 
cccagggcct 
cctctctccg 
ggcggggaag 
ggccgggctc 
gagggactgg 
cgcggacccc 
cccagcccct 
agcgctgcgt 



aacatctggg 
tgatagaat t 
acatcatcag 
tatcagcaat 
actccagcat 
atgttggctt 
tttccaaacc 
aggcttaggg 
agacgaggct 
gctagctcca 
ttaaggttgc 
ggcctcggga 
ctggattcct 
ggaccagtgg 
tccgaggctt 
atgtgaccag 
t tgtggctgg 
ccctttctcg 
gtggggaccc 
cccaagtcgc 
ggagcaatgc 
acgtccggca 
tgggtctccg 
ccacatcatg 
ctggggccct 
cgcggcccag 
ccagtggatt 
ggacccgggc 
gccccgtccc 
ccccttcctt 
cctgctgcgc 



tctgaggtag 
tttttattgt 
cttttcaaag 
cttcattgaa 
aatcttctgc 
ctctgcagag 
gcccctttgc 
atcactaagg 
aacctccagc 
taaataaagc 
gtttgttagc 
gacccagaag 
gggaagtcct 
ccgtgtggct 
ggagccaggt 
atgttggcct 
tgtgaggcgc 
acgggaccgc 
ctcgccgcct 
ggggaagtgt 
gtcctcgggt 
ttcgtggtgc 
gatcaggcca 
gcccctccct 
cgctggcgtc 
acccccgggt 
cgcgggcaca 
acccgtcctg 
gacccctccc 
tccgcggccc 
acgtgggaag 



3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5126 



<210> 2 
<211> 4042 
<212> DNA 

<213> Homo sapiens 



<400> 2 

gtttcaggca 

cgatgccgcg 

aggtgc tgcc 

agcgcgggga 

gggacgcacg 

tggtggcccg 

gcttcgcgct 

gcagctacct 

tgctgcgccg 

tgctggtggc 

ctgccactca 

aacgggcc tg 

gtgcgaggag 

gtggcgctgc 

gcaggacgcg 

aagaagccac 

gccgccagca 

cttgtccccc 

age tgcggcc 

tcgtggagac 

cccgcc tgcc 

aegegcagtg 

ccccagcagc 

aggaggacac 

aggtgtacgg 

ccaggcacaa 

atgecaaget 

tgegcaggag 

tcctggccaa 

tcttttatgt 

tctggagcaa 

agctgtcgga 

gactccgct t 

tgggagccag 

cactgt tcag 

tgctgggcc t 

aggacccgcc 

tcccccagga 

gcgtgcgtcg 



gege tgcgtc 
cgctccccgc 
gc tggccacg 
cccggcggct 
gccgcccccc 
agtgctgcag 
gctggacggg 
gcccaacacg 
cgtgggcgac 
tcccagctgc 
ggcccggccc 
gaaccatagc 
gegeggggge 
ccctgagccg 
tggaccgagt 
ctctttggag 
ccacgcgggc 
ggtgtacgcc 
ctccttccta 
catc 1 1 tctg 
ccagcgctac 
cccc tacggg 
cggtgtctgt 
agacccccgt 
ct tegtgegg 
cgaacgccgc 
ctcgctgcag 
cccaggggt t 
gt tcctgcac 
cacggagacc 
gt tgeaaage 
agcagaggtc 
catccccaag 
aacgttccgc 
cgtgc tcaac 
ggacgatatc 
gcccgagccg 
caggc tcacg 
gtatgccgtg 



ctgc tgegea 
tgccgagccg 
ttcgtgcggc 
t tccgcgcgc 
gccgccccct 
aggctgtgcg 
gcccgcgggg 
gtgaccgacg 
gacgtgctgg 
gcctaccagg 
ccgccacacg 
gtcagggagg 
agtgccagcc 
gageggaege 
gaccgtggtt 
ggtgcgctct 
cccccatcca 
gagaccaagc 
ctcagc tc tc 
ggttccaggc 
tggcaaatgc 
gtgctcctca 
gecegggaga 
cgee tgg tgc 
gec tgcc tgc 
ttcctcagga 
gagctgacgt 
ggctgtgttc 
tggctgatga 
aegt ttcaaa 
attggaatca 
aggcagcatc 
cc tgaeggge 
agagaaaaga 
tacgageggg 
cacagggcct 
tac t t tgtca 
gaggtcatcg 
gtccagaagg 



cgtgggaagc 
tgcgctccct 
gcctggggcc 
tggtggccca 
ccttccgcca 
agcgcggcgc 
gcccccccga 
cactgegggg 
ttcacctgct 
tgtgcgggcc 
ctagtggacc 
ccggggtccc 
gaagtctgee 
ccgttgggca 
tctgtgtggt 
ctggcacgcg 
catcgcggcc 
acttcctcta 
tgaggcccag 
cc tggatgee 
ggcccc tgtt 
agaegcac tg 
agccccaggg 
age tgc tccg 
gccggctggt 
acaccaagaa 
ggaagatgag 
cggccgcaga 
gtgtgtacgt 
agaacaggct 
gacagcactt 
gggaagccag 
tgeggecgat 
gggccgagcg 
cgcggcgccc 
ggcgcacctt 
aggtggatgt 
ccagcatcat 
ccgcccatgg 



cctggccccg 
gc tgegcage 
ccagggctgg 
gtgcctggtg 
ggtgtcctgc 
gaagaacgtg 
ggccttcacc 
gageggggeg 
ggcacgctgc 
gccgctgtac 
ccgaaggcgt 
cctgggcctg 
gttgcccaag 
ggggtcctgg 
gtcacctgcc 
ccactcccac 
accacgtccc 
c tcctcaggc 
cctgactggc 
agggactccc 
tctggagctg 
cccgctgcga 
ctctgtggcg 
ccagcacagc 
gcccccaggc 
gttcatctcc 
cgtgcgggac 
gcaccgtctg 
cgtcgagctg 
ctttttctac 
gaagagggtg 
gcccgccctg 
tgtgaacatg 
tctcacc teg 
cggcctcctg 
cgtgctgcgt 
gaegggegeg 
caaaccccag 
gcacgtccgc 



gccacccccg 
cactaccgcg 
cggctggtgc 
tgcgtgccct 
ctgaaggagc 
ctggccttcg 
accagcgtgc 
tgggggctgc 
gcgctctttg 
cagctcggcg 
ctgggatgcg 
ccagccccgg 
aggcccaggc 
gcccacccgg 
agacccgccg 
ccatccgtgg 
tgggacaege 
gacaaggagc 
gc teggagge 
cgcaggttgc 
cttgggaacc 
getgeggtea 
gcccccgagg 
agcccctggc 
c tctggggct 
c tggggaagc 
tgeget tggc 
cgtgaggaga 
ctcaggtctt 
eggaagagtg 
cage tgcggg 
ctgacgtcca 
gaetaegteg 
agggtgaagg 
ggegee tc tg 
gtgcgggccc 
tacgacacca 
aacaegtae t 
aaggee t tea 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 
agagccacgt
tgcaggagac 
aggccagcag 
tcaggggcaa 
tgctctgcag 
acgggctgct 
cgaaaacctt 
tgcggaagac 
ttcagatgcc 
tggaggtgca 
tcaaccgcgg 
tgaagtgtca 
acatctacaa 
cat ttcatca 
cctccctctg 
gcgccgccgg 
tcaagctgac 
agacgcagct 
acccggcact 
aggccgagag 
ggcggcccac 
aggcctgcat 
aagggctgag 
caccccaggg 
tccatcccca 
accatccagg 
aggtgtgccc 
ttggggggag 
aaaaaaaaaa 



ctctaccttg 
cagcccgctg 
tggcctcttc 
gtcctacgtc 
cctgtgctac 
cctgcgtttg 
cctcaggacc 
agtggtgaac 
ggcccacggc 
gagcgactac 
cttcaaggct 
cagcctgttt 
gatcctcctg 
gcaagtttgg 
ctactccatc 
ccctctgccc 
tcgacaccgt 
gagtcggaag 
gccctcagac 
cagacaccag 
acccaggccc 
gtccggctga 
tgtccagcac 
ccagcttttc 
gattcgccat 
tggagaccct 
tgtacacagg 
gtgctgtggg 
aaaaaaaaaa 



acagacctcc 
agggatgccg 
gacgtcttcc 
cagtgccagg 
ggcgacatgg 
gtggatgatt 
ctggtccgag 
ttccctgtag 
ctattcccct 
tccagctatg 
gggaggaaca 
ctggatttgc 
ctgcaggcgt 
aagaacccca 
ctgaaagcca 
tccgaggccg 
gtcacctacg 
ctcccgggga 
ttcaagacca 
cagccc tgtc 
gcaccgctgg 
aggctgagtg 
acctgccgtc 
ctcaccagga 
tgttcacccc 
gagaaggacc 
cgaggaccct 
agtaaaatac 
aa 



agccgtacat 
tcgtcatcga 
tacgcttcat 
ggatcccgca 
agaacaagct 
tcttgttggt 
gtgtccctga 
aagacgaggc 
ggtgcggcct 
cccggacctc 
tgcgtcgcaa 
aggtgaacag 
acaggtttca 
catttttcct 
agaacgcagg 
tgcagtggct 
tgccactcct 
cgacgctgac 
tcctggactg 
acgccgggct 
gagtctgagg 
tccggctgag 
ttcacttccc 
gcccggcttc 
tcgccctgcc 
ctgggagctc 
gcacctggat 
tgaatatatg 



gcgacagttc 
gcagagctcc 
gtgccaccac 
gggctccatc 
gtttgcgggg 
gacacctcac 
gtatggctgc 
cctgggtggc 
gctgctggat 
catcagagcc 
actctttggg 
cctccagacg 
cgcatgtgtg 
gcgcgtcatc 
gatgtcgctg 
gtgccaccaa 

ggggtcactc 

tgccctggag 
atggccaccc 
ctacgtccca 
cctgagtgag 
gcctgagcga 
cacaggctgg 
cactccccac 
ctcctttgcc 
tgggaatttg 
gggggtccct 
agtttt tcag 



gtggctcacc 
tccctgaatg 
gccgtgcgca 
ctctccacgc 
attcggcggg 
ctcacccacg 
gtggtgaact 
acggcttttg 
acccggaccc 
agtctcacct 
gtcttgcggc 
gtgtgcacca 
ctgcagctcc 
tctgacacgg 
ggggccaagg 
gcattcctgc 
aggacagccc 
gccgcagcca 
gcccacagcc 
gggagggagg 
tgtttggccg 
gtgtccagcc 
cgctcggctc 
ataggaatag 
ttccaccccc 
gagtgaccaa 
gtgggtcaaa 
ttttgaaaaa 



2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4042 



<210> 3 
<211> 11276 
<212> DNA 

<213> Homo sapiens 



<400> 3 

act tgagccc 

tggtgacaga 

tct tctctgg 

atacaaacac 

t taaaaagga 

acccacggta 

tcaaaaaagt 

gccaaggcgg 

aaccttgtcg 

cage tac teg 

gagcegggat 

aaaaaaaaaa 

aaaagcaaga 

cagaaataaa 

ttt tgaaaag 

acc taaataa 

caaaggatca 

aaaatagata 

agcccaaaca 

agagaagece 

gaattccaat 

tctacatggc 

aacaaaaaaa 

aaatcctcaa 

gtgatcaagt 

atgtgataca 

cagaaaaagc 

tatacaagaa 

aggecaaggt 

gagacc tggt 

tag tcccagc 

tgcagtgagc 

gaa taagaag 

aggaggtgga 

1 1 tcaacata 

agee 1 1 tec t 

tagtactaga 

ctggaaagga 

gac ttaagac 

caatgtacaa 

caaaaaagca 

tctacaatga 



aagagttcaa 
atgagaccct 
ccacagtgga 
atgaaaatta 
aattgaaaaa 
tacagcaaaa 
agaaaageca 
gcagatcgcc 
ctac taaaaa 
ggaggctgag 
tgegecat tg 
aagtagaaaa 
gcaaac taaa 
tgaaactgaa 
ataaacaaaa 
ataaagtcag 
ctagaggcta 
aattcctaga 
gaccaataac 
aggacccaat 
cctactcaaa 
cagtattacc 
cagaaagaaa 
caaaacac ta 
gggatttatt 
tcatcccaac 
atttgataaa 
acatacaggc 
gggatgat tg 
ctacaaaaaa 
tagtc tggag 
catgaacatg 
aaggagaagg 
ggagaagtgg 
ataaaagece 
ctaagatctg 
agtcctagc t 
agaagtcaaa 
accactaaaa 
aaatcagtag 
gctacaaata 
aaactataaa 



ggctacggtg 
gtctcaaaaa 
acaaaaccag 
aacaatatac 
tttatttaag 
gcagtgctaa 
ggcgcagtgg 
tgaggtcagg 
tacaaaatta 
gcaggataac 
gac tccagcc 
act taaaaat 
cctaaaattg 
agataacaat 
ttgacaaacc 
agatgaaaaa 
ctatgagcaa 
tgcatacaac 
aataatggga 
ggcttccctg 
ctattctgaa 
ctgattccaa 
gaaaac taca 
gcaaaccaaa 
ccagggatgg 
aaaatgaagt 
attctgcacc 
caggcacagt 
cttgggccca 
c tt ttt taaa 
gctgaggtgg 
tcactg tact 
agaagggaga 
aaggggaagg 
tatatgacag 
gaaaa tgaca 
agagcaa t ca 
ttatcctgt t 
aac tat taga 
tat ttc tata 
aaa ttaaaca 
acg ttgacaa 



agecatgatt 
aaaaaaaaaa 
aaatcaacaa 
ttctgaatga 
caaatgataa 
gaaggaagt t 
ctcatgcctg 
agt tcgagac 
gctgggcatg 
cgcttgaacc 
tgggtaacaa 
acaacctaat 
gtaaaagaaa 
acaaaagatc 
tttgeccaga 
agagacatta 
ctgtacacta 
ctaccaagat 
ttaaagecat 
ctggatttta 
aaatagagga 
aaccagacaa 
ggecaatate 
t taaacaaca 
aaggatggct 
acaaaaacta 
ct tcatgata 
ggctcacacc 
ggagtt tgag 
aaattageca 
gagaatcact 
ccagcc taga 
agggagggag 
ggaagggaaa 
accgaggtag 
agggcccact 
gataagagaa 
tgcagatgat 
gctgaaatt t 
ttccaacagc 
gctaggaatt 
aagaaat tga 



gcaacaccac 
aat tgaaata 
caagaggaat 
ccagtgagtc 
eggaaacata 
tatagctata 
taatcccagc 
cagcctgacc 
gtggcacatg 
caggaggtgg 
gagtgaaacc 
gatgeacett 
agaaataata 
aacaaaatta 
ctaagaaaaa 
caac tgatac 
ataaattgaa 
tgaaccatga 
aataaaaagt 
ccaatcatt t 
aagaatactt 
aaacacatca 
cctgatgaat 
ccttcgaaag 
caacatatgc 
tatgattat t 
aaaaccctca 
tgcgatccca 
actagectgg 
ggcatgatgg 
taagee tagg 
caacagaaca 
aagggaggag 
gaggaagaag 
tat tatgagg 
t tcaccac tg 
agaaataaaa 
atgatct tat 
ggtacagcag 
aaacaatctg 
aaccaaagaa 
agagggcaca 



acgccagcct 
atataaagca 
tttgaaaact 
aatgaagaaa 
acctctcaaa 
agcagc taca 
act ttgggag 
aacacagaga 
cctgtaatcc 
aggttgcggt 
ctgtctcaag 
aaagaactag 
aagatcagag 
aaagttggtt 
aggaaagaag 
cacagaaat t 
aaacctagaa 
agaaatccaa 
ctcctagcaa 
aaagaagaat 
ccaaactcat 
aaaacaaaca 
actgatacaa 
atcattcatt 
aaatcaatca 
tcactttatg 
aaaaaccagg 
gcactctggg 
gcaacaaaat 
catatgectg 
aggtcgaggc 
agaccccact 
gaggagaagg 
aagaaacata 
aaaaac tgaa 
tgat tcaaca 
ggcatccaaa 
atctggaaaa 
ga tacaaaa t 
aaaaagaaac 
gtgaaagatc 
aaaaaagaaa 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 



agatattcca
agcaatttac 
agaagaaaca 
cctgaccaaa 
agctatagta 
gaacagaata 
aggtgccaag 
ctggatatcc 
aaatcaaaat 
aacaccggag 
caggcacagg 
tgcccagcaa 
tttgcaaact 
ctctataaga 
catttctcaa 
ctgatcatca 
atggctttta 
acccttggac 
ttcctcaaaa 
caaaaaaggg 
ttcatagcag 
aaaatgtggt 
tcagttgcaa 
aaagacaaac 
agaaatagag 
taatttattg 
gaaaggataa 
ttgtatgcct 
aattaaaatt 
gccgaggcgg 
aaccctgtct 
ccaac tactc 
tgagccgaga 
acaaaaacaa 
c tactatatt 
aaataagaac 
tggcagaaat 
taagtgactt 
cctaaaacaa 
tatctctaaa 
tgcttttttt 
ate tgtatga 
tggcaggaag 
aagacccagg 
gggegatgag 
gcaaagattg 
agcctcctgc 
agaggagtca 
aaggtgtat t 
gcacccttct 
ccctc tcttg 
gcccgtgctg 
ctgagcctgg 
caaagactta 
tcttaatatt 
aaaacaggaa 
atttt tegee 
gtacacgagg 
gtccccggcc 
t tgggcaacg 
gcccacccac 
cttgcctgcc 
atg taaat ta 
cccaaggact 
tacaaacacc 
agtccaggag 
agtc tgt tec 
tgetget tec 
cagccctgtc 
ctgggtgggc 
aegtage teg 
gcgt tgaagg 
cgtggccccc 
tccccgtc tc 
tccacgtcca 
aegggggegt 
agtgcctgtc 
cccgccc t tc 



tgttcataga 
aaattcaatg 
attctaagat 
aagaacaaaa 
acccaaacta 
gagaatccag 
aacatacttt 
atatgeaaaa 
ggatgaaagg 
aaactctcca 
caaccaaagc 
aggaaacaat 
attcatctaa 
aaaacaccta 
aataagtcat 
gagaaatgea 
ttcaaaagac 
actgttggtg 
aactaaaaat 
aatcagtgta 
ccaaggtttg 
gcacatacac 
cagcatgggg 
ttttcatgtt 
gagaatggtg 
tatgttttaa 
atgcttgaag 
gtatcaaaat 
ttaatggcca 
gtggatcacc 
ctactaaaga 
aggaggctga 
tcatgccact 
aaaaaagaag 
agaagttaaa 
aatgtatgtg 
gtgaggaggg 
aattttaacc 
ctgctaataa 
ategagctge 
et tgtgtgct 
atcctgaaac 
caggtggctc 
tgcaaggcag 
aagcctgcct 
c tctggatac 
tcaaacccag 
aggcacctcg 
c tgagggaag 
caagggaaaa 
ccctcgcggt 
aggaccctct 
cttaataaca 
attccatgag 
ttcttaaatt 
ctgagctatg 
ctaagtactt 
agaggectgg 
tgggaggctg 
egaaggegge 
ac taacccag 
tgcttccctg 
cacgac tc tg 
gaatgattcc 
actct tttac 
agatgaggct 
tc tagactag 
cgagggcgcc 
tccacaccc t 
cgtgt tccag 
caeggt tcct 
gaggagat tc 
gatgeaggtt 
ctgtcatc tg 
gc tgcgtg tg 
ggtgggccag 
c tcacc tagg 
tc tgcccagc 



ttggaagaat 
caatccc tat 
ttgtacagaa 
ctggaagcat 
catggtactg 
aaacaaatcc 
ggggaaaaga 
taacaatact 
cttaaatcta 
ggacattgga 
aaaaacagac 
caacaaagag 
caaggaatta 
ataagctgat 
acaaatggca 
aatcaaaact 
aggcaataac 
ggaatggaaa 
aaagctacca 
tcaacaagct 
gaagcaacct 
aatggagtac 
ggcactggtc 
ctcccttact 
gttctagagg 
aataactaaa 
gtgacagata 
atctcatgta 
ggcacggtgg 
tgaggtcagg 
tacaaaaatt 
gacaggagaa 
gcactgcagc 
at taaaattg 
aat taaaaca 
gggtttctag 
aacagtggaa 
aaagacaggc 
tggtgaaagg 
agaattggca 
tggagatt tt 
gaaaaatggt 
tgtggacctg 
aggectgatg 
cgt tggtgag 
catctggaaa 
gccagcagct 
aagtatggct 
ct tgagttag 
ccagacgccc 
ttctgategg 
tgeaaaggge 
aactgggatg 
taaattcaac 
tcatcaaata 
tt tgecaagg 
tttattggtt 
geggcaggge 
acagcaggac 
cacgctgcgt 
gaagtcaegg 
ggtgggtcaa 
ctgatgggga 
agcaacttct 
taggcccaca 
get ttcagee 
tagaccc tgg 
ate tgece tg 
ccgcc tccag 
cgctactgtc 
cc tcacatgg 
tgegee tccc 
cc tggegtec 
ccggggcc tg 
tc tctgcccg 
ggegetc t tg 
tccacgggca 
act ttcctgc 



aaatactgtt 
taaaatacta 
ccacaaaaga 
cacattacct 
gcataaaaac 
atgeatctae 
taatctcttc 
agaactctgt 
aaacctcaaa 
gtgggcaaag 
aaatgggatc 
aagagacaac 
ataaccagta 
tttcaaaaat 
aacaggcatc 
actatgagag 
aaatgccagt 
ttgctaccac 
tacagcaatc 
atctccactc 
cagtgtccat 
tacgcagcca 
agtatgttaa 
tgtgggagca 
ggtgggggac 
agagtataat 
ccccatttac 
tgctatagat 
ctcatgtccg 
agtttgaaac 
agecaggegt 
ttgettgaac 
ctgggtgaca 
taatttttat 
attataaaag 
cttctgaaga 
gttactgttg 
tgggagaagt 
taatctctat 
egtctgatea 
cgattgtgtg 
ggtgatt tec 
agecact tea 
acccgaggac 
cagegcatga 
aggeggecag 
atggcgccca 
taaatctttt 
gtgecttett 
gc tc tgcggt 
gacagagtga 
tccacagacc 
tggctggggg 
ctttccacat 
acattcagga 
tccaaggact 
t tcataaggt 
tatgagcacg 
cac tgaccgt 
gtgactcagg 
age tctgaac 
gggtaa tgaa 
ccgt tcct tc 
tcgggtgtga 
gagcaeggsc 
accaggctgg 
caggcactcc 
gagactcagc 
gec tcagc tt 
tcacctgtcc 
ggtgtc tgtc 
agac tggc tc 
ggc tgeaege 
ccggtgtgtt 
ctaggg tc tc 
ggaaatgcaa 
caggcctggg 
ccccc tccc t 



aaaatgtcca 
atgacgttct 
cccagaatag 
gacttcaaat 
agatgagaca 
agtgaactca 
aataaatggt 
ctctcaccat 
etttgeaact 
acttcttgag 
atatcaagtt 
ccacagaatg 
tatataagga 
aagcaaaaga 
tgaaaatgtg 
atcatctcat 
gaggatgtgg 
tatggagaac 
ecattgetag 
ccacatttac 
caacagacga 
taaaaaagaa 
gtgaaataag 
aaaattaaaa 
agggtgacta 
tgggttgttt 
cctgatgtga 
ataaacccta 
taatcccagc 
cagtctggcc 
ggtggcacat 
ctgggaggcg 
gagcaagact 
gtacegtata 
gtaattaacc 
agtaaaagtt 
ttagacgetc 
taaagaggca 
taattaccaa 
caccgtcctc 
ttcgtgtttg 
tccagaagaa 
atcttcaagg 
aggaaagctc 
agtgccctta 
egggaatgea 
cccgggcgtg 
tttcacctga 
taaaacagaa 
cat ttacctc 
cccccgtgga 
cccgccctgg 
eggacagega 
ccgaatggat 
c tgcagaaat 
taataaccat 
ggcttagggt 
gcagggccac 
cctccctggg 
aacccatacc 
ccgtggaaac 
gtggtgtgca 
catcattatt 
caagecatga 
cacacccctg 
ggtgacaaca 
cccagattct 
ctggggtgcc 
c tccagcagc 
cactgtgtct 
tcct tcccca 
ctctgagcct 
tgacctccat 
cttctgtttc 
ggggt t t t ta 
catt tgggtg 
gatggagece 
c tggaacaca 



tactacccaa 
tcacagaaat 
ccaaagctat 
tatactacaa 
tggaccagag 
tttttgacaa 
gctggaggaa 
atacaaaagc 
actaaaagaa 
taattccctg 
aaaaagcttc 
ggagaatata 
gctcaaacta 
tctgggtaga 
ctcaacacca 
cccagttaaa 
ataaaaggaa 
agtttgaaag 
gtatatactc 
tgcagcactg 
atggaaaaag 
tgagatcctg 
ccaggcacag 
caattgacat 
gagtcaacaa 
gtaacacaaa 
t tattacaca 
ctatattaaa 
actttgggag 
accatgatga 
acctgtagtc 
gaggttgcag 
ccatctcaaa 
aatatatact 
acttaatcta 
atggccacga 
atactctctg 
ttctataagc 
taattacaga 
teattcaegg 
gttaaac tta 
ttagagtacc 
gtctctggcc 
ggatgggaag 
tttacgcttt 
aggagtcaga 
tgccagaggg 
agcagtgacc 
agtcatggaa 
tttcctctct 
gcttctccga 
agagaggagt 
eggegggatt 
ttggatttta 
ecaaaggegt 
gttcagaggg 
gcaagggaaa 
eggggagaga 
agctgccaca 
ggc ttcc tgg 
gaacatgacc 
ggaaatggcc 
catcttcacc 
caaaac tcag 
atatattaag 
geggctgaac 
agggee tggt 
acac tgaggc 
t tcctaaacc 
tgtctcagcg 
acac tcacat 
gaacctggc t 
t tccaggcgc 
tgtgctcct t 
taggcatagg 
tgaaagtagg 
ccgccaggga 
gagtggcagt 



2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6960 
7020 
7080 
7140 
7200 






ctcttcccaa 
cacgtgacta 
ttaatagcta 
aacgggtcca 
agcctgcagg 
tacgcaacat 
taaggacggt 
cctggttctg 
ggaacccgga 
agagatgccc 
ctttgcaggt 
ggcccgaaaa 
gcctcaggac 
aggcctgcaa 
cagcaggaag 
gtgccatagg 
gcaacaggaa 
catccaagga 
tttagctatt 
gctggagtgc 
ttctcgtgcc 
aattttgtat 
ctgacctcag 
cactgcacct 
gtaaggagtt 
tctgtaattc 
ccacctgtta 
ccagtggggt 
actgtcctga 
ctcctactct 
ctcactcctg 
ctgtttcatt 
ggctggaggg 
agtgat tctc 
cagctaattt 
ggc tggtctc 
gat tacaggt 
tctgaggtag 
tttttattgt 
cttttcaaag 
cttcattgaa 
aatcttctgc 
ctctgcagag 
gcccct ttgc 
atcac taagg 
aacctccagc 
taaataaagc 
gtt tgttagc 
gacccagaag 
gggaagtcct 
ccgtgtggct 
ggagccaggt 
atgttggcct 
tgtgaggcgc 
acgggaccgc 
ctcgccgcct 
ggggaagtgt 
gtcctcgggt 
t tcgtggtgc 
gatcaggcca 
gcccctccct 
cgctggcgcc 
acccccgggt 
cgcgggcaca 
acccgtcctg 
gacccc tccc 
tccgcggccc 
acgtgggaag 



aagacccagc 
cgcacatcat 
caaagcaggg 
tccgcacggt 
catctcaagg 
gctcaaaaag 
gggggcggca 
atggtattgg 
ggctgtgcca 
acgtcctgat 
gtgatctccg 
gtaatccagg 
gatggaggca 
gcgcctccag 
gcacggctgg 
agggcactcg 
acccatgcac 
cagggctgaa 
ttattttatt 
agcggcatga 
tcagcctccc 
ttttagtaga 
gtgatccgcc 
ggcctattta 
catggagttc 
ttcgtagact 
tcccatggga 
tgccatctgc 
atctcaatgt 
actgggattg 
tggaggaagg 
tgttggtttg 
agtgcaatgg 
ctgcttccgc 
tttgtatttt 
gaacttctga 
gtgagccacc 
gaagctcacc 
tgt tagaaca 
acacac taac 
tgccgggagg 
ttccatttct 
aaccagtgta 
cctagtggca 
ggatttctag 
gagcgtgaca 
aatttcctcc 
atttcagtgt 
tt tctcgccc 
cagctgtcct 
tctactgctg 
gcctggaccc 
catctgccag 
ccggtgcgcg 
cccggtgggt 
gagaacctgc 
tgcagggagg 
tcgtccccag 
ccggagcccg 
gcggccaaag 
cgggttaccc 
cctgcaccct 
ccgcccggag 
gacgcccagg 
cccct tcacc 
gggtccccgg 
cgccctctcc 
ccctggcccc 



ttccacaagc 
gccccacagc 
ccgacccccg 
tttaacaaac 
gaggaacatg 
tgccacctcc 
aggggagtgg 
ccttttacta 
cataggggag 
cctgggcagg 
aacccgcccg 
atccttcggg 
aggagggtca 
ggagggaggg 
gggaccctcc 
atcgtggacc 
gtgtgtgggg 
aaaacaaagg 
ggcaggcacg 
gttatgctct 
cgtctcctgg 
gtgcaccacc 
gtcaagctga 
ctgggattac 
ggctcaagtc 
gttaccctcc 
catattcaca 
gaggctgcag 
atcagggcgc 
gtagaaatta 
ccaggggcag 
cac tgc tggt 
ggtttcactc 
gcctctgcct 
caggcacccg 
ggggt tcacc 
tctgcctcct 
tctgtttaga 
tttaagccaa 
gatgac taag 
gtcttctggg 
ggtgt taatt 
tgtgttttct 
tggaacaaat 
ccc t ttaaaa 
agtatt taca 
ctgt tcaaat 
aaggttacat 
atccctgcaa 
aacccggagt 
ggtctggagg 
agctctgcag 
tgcgggcggg 
gtcaaggccg 
ccatttccca 
tttgctcatg 
tgtcaaggag 
gcccgtccag 
cctccccttc 
gaggcagccc 
cccagggcct 
cc tctctccg 
ggcggggaag 
ggccgggctc 
gagggactgg 
cgcggacccc 
cccagcccct 
agcgc tgcgt 

<210> 4 
<211> 104 
<212> DNA 
<213> Homo 

<400> 4 

gtgggcctcc 

catgcggaga 



actaagcatc 
cctgggaatt 
ctgttttatt 
tggttaaaca 
ccgtttataa 
atgggatacg 
ttaggggggt 
aagccagttt 
tggggatggg 
ataatgctct 
gccccagggc 
actacctgca 
gaggggggca 
cctcgagccc 
acggagcctg 
tccggcctcc 
atttgcagaa 
tttacagaaa 
agtgatttta 
tgttgcccag 
gttcaagcaa 
acacccggct 
tctcaaaatc 
aggcatgagc 
acacccactg 
tttgatattt 
gtttctgtga 
gcttcaggtc 
aagtgtggac 
aagtccatcc 
aggagt tec t 
actgaatcca 
ttgttgctca 
cccaggttca 
ccaccatgcc 
atgttggcca 
aaagtgctgg 
aacatctggg 
tgatagaatt 
acatcatcag 
tatcagcaat 
actccagcat 
atgttggctt 
tttccaaacc 
aggcttaggg 
agacgaggct 
gctagctcca 
ttaaggttgc 
ggcctcggga 
ctggattcct 
ggaccagtgg 
tccgaggctt 
atgtgaccag 
ttgtggctgg 
ccctttctcg 
gtggggaccc 
cccaagtcgc 
ggagcaatgc 
acgtccggca 
tgggtctccg 
ccacatcatg 
ctggggccct 
cgcggcccag 
ccagtggatt 
ggacccgggc 
gccccgtccc 
cccc t tec 1 1 
cctgctgcgc 



sapiens 



ceggggtegg 
geagegcagg 



cgtccggc tg 
cgac tcaggg 



gggttgaggg 
cgc t tccccc 



attggcaccc 
gtacacactc 
aaatccctgc 
ggacagttcc 
gaattacget 
aaagaatttc 
gctgggggc t 
ctcagttatg 
tetttgecat 
tcccccaaac 
tgaggaccct 
ggttctggga 
gtcagtctga 
aagctggaaa 
cccttagccc 
cgctgccctt 
tgtgaatcta 
gtgcctccgg 
tacttacttt 
tcttggctca 
aagtagctgg 
gatgggcttt 
cacctcagcc 
accattttaa 
aatttcccct 
ggggatacac 
cccactgcag 
cagtagaaac 
ctcagtgtgt 
agccccttcc 
aatgatactt 
tttgttttgt 
cgcgatcttg 
ctcccatt tg 
tagtagagac 
cctcagatga 
atgcccagc t 
ccactcaagt 
ctcttgatgt 
tgcacccata 
cgt ttcctcg 
tctcttccct 
agctacaact 
gagacaattc 
aagagegace 
geccagggag 
ggcagtttct 
t tgccgacct 
ccttagatcc 
gcggttgtgc 
ggctggaagt 
cgaggctgcc 
acagagtgee 
gecagcagga 
gattaacaga 
aaagagaaat 
cactccggga 
ccgcgtctac 
acgccccgcg 
ggtcgccgca 
cacagcctag 
gggagegega 
cagctgcgct 
accgcgctcc 
ttccagctcc 
cccagccccc 
tcgcggcgcg 
ggccaccccc 



cggccggggg 
gcag 



ctggacattt 
ccgtccacga 
taaaatgtcc 
tcacagtgaa 
gagtcaaaac 
accccatggc 
actgcacgca 
ggagactaac 
gcccgagtgt 
ctgtggacag 
gaggtctggg 
agaggeggge 
ggctgaaaag 
aagcggggaa 
accagggccc 
ctagcatgaa 
ggattatttc 
gcaagggcag 
ctgagacaga 
ctgcaacctc 
gatttcaggc 
caccatgttg 
tcccaaagtg 
aacttccctg 
ttactcagga 
cgtctcttga 
gggcagctgg 
ctgatgtaga 
gctgaaacat 
ctatcccccc 
tgttattttt 
t ttgagaggc 
gettactgea 
gctgggatta 
gggggtgggt 
tccacctgcc 
cagaatttac 
gttgtggtgt 
tttacactgt 
atactggggt 
ccatgcacat 
ct t ttaaaat 
taacttttgt 
acaaacacag 
tgtaatccta 
ggtgcgaggc 
gaaagtagga 
cagctacagc 
aaac t tgagc 
cggggcccca 
cgggcctcct 
c tccaccctg 
ggggeccagg 
gcgcctggct 
tttggggtgg 
gaegggectg 
ggtcccgcgt 
gcgcctccgt 
tccggacctg 
cgcacctgtt 
gecgattega 
gcggcgcgcg 
gteggggeca 
ccacgtggcg 
gcctcctccg 
tccgggccct 
agt ttcaggc 
gcgatg 



gaaccagcga 



7260 

7320 

7380 

7440 

7500 

7560 

7620 

7680 

7740 

7800 

7860 

7920 

7980 

8040 

8100 

8160 

8220 

8280 

8340 

8400 

8460 

8520 

8580 

8640 

8700 

8760 

8820 

8880 

8940 

9000 

9060 

9120 

9180 

9240 

9300 

9360 

9420 

9480 

9540 

9600 

9660 

9720 

9780 

9840 

9900 

9960 

10020 

10080 

10140 

10200 

10260 

10320 

10380 

10440 

10500 

10560 

10620 

10680 

10740 

10800 

10860 

10920 

10980 

11040 

11100 

11160 

11220 

11276 



60 
104 
<210> 5 
<211> 8616 
<212> DNA 

<213> Homo sapiens 



<400> 5 

gtgaggaggt ggtggccgtc gagggcccag gccccagagc tgaatgcagt aggggctcag 60 
aaaagggggc aggcagagcc ctggtcctcc tgtctccatc gtcacgtggg cacacgtggc 120 
ttttcgctca ggacgtcgag tggacacggt gatctctgcc tctgctctcc ctcctgtcca 180 
gtttgcataa acttacgagg ttcaccttca cgttttgatg gacacgcggt ttccaggcgc 240 
cgaggccaga gcagtgaaca gaggaggctg ggcgcggcag tggagccggg ttgccggcaa 300 
tggggagaag tgtctggaag cacagacgct ctggcgaggg tgcctgcagg ttacctataa 360 
tcctcttcgc aatttcaagg gtgggaatga gaggtgggga cgagaacccc ctcttcctgg 420 
gggtgggagg taagggtttt gcaggtgcac gtggtcagcc aatatgcagg tttgtgttta 480 
agatttaatt gtgtgttgac ggccaggtgc ggtggctcac gccggtaatc ccagcacttt 540 
gggaagctga ggcaggtgga tcacctgagg tcaggagttt gagaccagcc tgaccaacat 600 
ggtgaaaccc tatctgtact aaaaatacaa aaattagctg ggcatggtgg tgtgtgcctg 660 
taatcccagc tacttgggag gctgaggcag gagaatcact tgaacccagg aggcggaggc 720 
tgcagtgagc tgagattgtg ccattgtact ccagcctggg cgacaagagt gaaactctgt 780 
ctttaaaaaa aaaaagtgtt cgttgattgt gccaggacag ggtagaggga gggagataag 840 
actgttctcc agcacagatc ctggtcccat ctttaggtat gaagagggcc acatgggagc 900 
agaggacagc agatggctcc acctgctgag gaagggacag tgtttgtggg tgttcagggg 960 
atggtgctgc tgggccctgc cgtgtcccca ccctgttttt ctggatttga tgttgaggaa 1020 
cctccgctcc agcccccttt tggctcccag tgctcccagg ccctaccgtg gcagctagaa 1080 
gaagtcccga tttcaccccc tccccacaaa ctcccaagac atgtaagact tccggccatg 1140 
cagacaagga gggtgacctt cttggggctc ttttttttct ttttttcttt ttatggtggc 1200 
aaaagtcata taacatgaga ttggcactcc taacaccgtt ttctgtgtac agtgcagaat 1260 
tgctaactcg gcggtgttta cagcaggttg cttgaaatgc tgcgtcttgc gtgactggaa 1320 
gtccctaccc atcgaacggc agctgcctca cacctgctgc ggctcaggtg gaccacgccg 1380 
agtcagataa gcgtcatgca acccagtttt gctttttgtg ctccagcttc cttcgttgag 1440 
gagagtttga gttctctgat caggactctg cctgtcattg ctgttctctg acttcagatg 1500 
aggtcacaat ctgcccctgg cttatgcagg gagtgaggcg tggtccccgg gtgtccctgt 1560 
cacgtgcagg gtgagtgagg cgttgccccc aggtgtccct gtcacgtgta gggtgagtga 1620 
ggcgcggccc ccgggtgtcc ctgtcccgtg cagcgtgatt gaggtgtggc ccccgggtgt 1680 
ccctgtcacg tgtagggtga gtgaggcgcc atccccgggt gtccctgtca cgtgtagggt 1740 
gagtgaggcg tggtccccgg gtgtccctgt cccgtgcagg gtgagtgagg cactgtcccc 1800 
gggtgtccct gtcacgtgca gggtgagtga ggcgcggtcc ccgggtgtcc ctctcaggtg 1860 
tagggtgagt gaggcgcggc cccagggtgt ccctgtcacg tgtagggtga gtgaggcacc 1920 
gtccctgggt gtccctccca ggtatagggt gagtgaggca ctgtccccgg gtgtccctgt 1980 
cacgtgcagg gtgagtgagg cgcggccccc gggtgtccct ctcaggtgca gggtgagtga 2040 
ggcgctgtcc ctgggtgtcc ctgtctcgtg tagggtgagt gaggctctgt ccccaggtgt 2100 
ccttggcgtt tgctcacttg agcttgctcc tgaatgtttg ctctttctat agccacagct 2160 
gcgccggttg cccattgcct gggtagatgg tgcaggcgca gtgctggtcc ccaagcctat 2220 
cttttctgat gctcggctct tcttggtcac ctctccgttc cattttgcta cggggacacg 2280 
ggactgcagg ctctcgcctc ccgcgtgcca ggcactgcag ccacagcttc aggtccgctt 2340 
gcctctgttg ggcctggctt gctcaccacg tgcccgccac atgcatgctg ccaatactcc 2400 
tctcccagct tgtctcatgc cgaggctgga ctctgggctg cctgtgtctg ctgccacgtg 2460 
ttgctggaga catcccagaa agggttctct gtgccctgaa ggaaagcaag tcaccccagc 2520 
cccctcactt gtcctgtttt ctcccaagct gcccctctgc ttggccccct tgggtgggtg 2580 
gcaacgcttg tcaccttatt ctgggcacct gccgctcatt gcttaggctg ggctctgcct 2640 
ccagtcgccc cctcacatgg attgacgtcc agccacaggt tggagtgtct ctgtctgtct 2700 
cctgctctga gacccacgtg gagggccggt gtctccgcca gccttcgtca gacttccctc 2760 
ttgggtctta gttttgaatt tcactgattt acctctgacg tttctatctc tccattgtat 2820 
gctttttctt ggtttattct ttcattcctt ttctagcttc ttagtttagt catgcctttc 2880 
cctctaagtg ctgccttacc tgcaccctgt gttttgatgt gaagtaatct caacatcagc 2940 
cactttcaag tgttcttaaa atacttcaaa gtgttaatac ttcttttaag tattcttatt 3000 
ctgtgatttt tttctttgtg cacgctgtgt tttgacgtga aatcattttg atatcagtga 3060 
cttttaagta ttctttagct tattctgtga tttctttgag cagtgagtta tttgaacact 3120 
gtttatgttc aagatatgta gagtatcaag atacgtagag tattttaagt tatcatttta 3180 
ttattgattt ctaactcagt tgtgtagtgg tctgtataat accaattatt tgaagtttgc 3240 
ggagccttgc tttgtgatct agtgtgtgca tggtttccag aactgtccat tgtaaatttg 3300 
acatcctgtc aatagtgggc atgcatgttc actatatcca gcttattaag gtccagtgca 3360 
aagcttctgt ctccttctag atgcatgaaa ttccaagaag gaggccatag tccctcacct 3420 
gggggatggg tctgttcatt tcttctcgtt tggtagcatt tatgtgaggc attgttaggt 3480 
gcatgcacgt ggtagaattt ttatcttcct gatgagtgaa tcttttggag acttctatgt 3540 
ctctagtaat ctagtaattc tttttttaaa ttgctcttag tactgccaca ctgggcttct 3600 
tttgattagt attttcctgc tgtgtctgtt ttctgccttt aatttatata tatatatata 3660 
tttttttttt ttttgagaca gagtcttggt ctgtcgccca gggtgagtgc agtggtgtga 3720 
tcacaggtca gtgtaacttt taccttctgg cctgagccgt cctctcacct cagcctcctg 3780 
agtagctgga actgcagaca cgcaccgcta cacctggcta atttttaaat tttttctgga 3840 
gacagggtct tgctgtgttg cccaggctgg tctcaaactc ttggactcaa gggatccatc 3900 
tacctcggct tcccaaagtg ctgaattaca ggcatgagcc accatgtctg gcctaatttt 3960 
caacactttt atattcttat agtgtgggta tgtcctgtta acagcatgta ggtgaatttc 4020 
caatccagtc tgacagtcgt tgtttaactg gataacctga tttattttca tttttttgtc 4080 
actagagacc cgcctggtgc actctgattc tccacttgcc tgttgcatgt cctcgttccc 4140 
ttgtttctca ccacctcttg ggttgccatg tgcgtttcct gccgagtgtg tgttgatcct 4200 
ctcgttgcct cctggtcact gggcatttgc ttttatttct ctttgcttag tgttaccccc 4260 
tgatcttttt attgtcgttg tttgcttttg tttattgaga cagtctcact ctgtcaccca 4320 
ggctggagtg 
gttctcattc 
taatttttgt 
tcctgacctc 
gccaccgtgc 
tcctgagcaa 
ttttccctgc 
ttaatattgt 
tttgttcccc 
tgcgtggttc 
atggcatcta 
tcacgaggag 
cttagccagt 
catgtcgggg 
gcgccagctc 
aaccaggaca 
atgtggataa 
ctttgggagg 
atgatgaaac 
tgtaatccca 
gttgcagtga 
ctcaaaaaaa 
aagaaaaggt 
atcattttag 
tttgtctgcg 
ggc ttcccat 
ccc tcagtga 
ccctgctgtg 
gccctcggtg 
gccctgctgt 
ggccctcggt 
ggccctgctg 
aggccctgcg 
cagacggtgc 
ggtgaggttg 
gtgtgaggtc 
ggggtgaagg 
gtctggagtg 
gtccggggtg 
ggtctggagt 
ggtccggggt 
aggtctggag 
caggtcaggg 
caggtccggg 
gcaggtctgg 
gcaggtccgg 
tgcaggtctg 
tgcaggtccg 
gtgcagtccg 
gtgcaggtct 
gctgcaggtc 
gtccggatgg 
gtccggatgg 
gtctgcatgg 
gtccggatgg 
tgtctggatg 
gtgtccggat 
ggtgtccgga 
cggtgtc tgg 
gctgtatccg 
tgctgtatcc 
atgcggtgtc 
gtgcggtgtc 
tgtgctgtat 
atgtgc tgta 
gatgtgcagt 
gtatgtgtgt 
tggatgtgtg 
tggatatgcg 
tggtgggctg 
tggtgagctg 
cggtgatctg 



taatggcaca 
ctcaacctca 
atttttagta 
aagtgatctg 
ccggcatacc 
taagaccctt 
tgacttagtt 
tttccgtgtt 
gtctgtcttc 
ttctgtcttg 
gcgacgtccg 
ggcggtcatc 
gagtgacagc 
tctggtggct 
tgacggtgct 
aaggatgagg 
ttttaaaatt 
ccaaggcggg 
cccatctgta 
gctactcggg 
gccgacattg 
aaaaaaaaaa 
gaaattaatg 
ggtgttattg 
ggatcccgtg 
ggccatggct 
gctggatgtg 
age tggatgt 
agctggaggt 
gagctggatg 
aagctggagg 
tgagctggat 
gtgagctggg 
cagaccatgc 
ccaggccctg 
accaggccct 
tcgccaggcc 
aggtcgecag 
aggtcgecag 
gaggtcgeca 
gaggtcgeca 
tgaggtcgee 
gtgaggtctc 
gtgaggtege 
ggtgtggtcg 
ggtgaggtcg 
gggtgtggtc 
gggtgaggtt 
gggtgaggtc 
ggggtgaggt 
cggggtgagt 
tgcaggtcca 
tgcaggtctg 
tgcaggtctg 
tgcaggtccg 
gtgeaggtec 
ggtgcaggtc 
tggtgcaggt 
atggtgcagg 
gatggtgcag 
ggatggtgca 
ggatggtgca 
cggatggtgc 
ccggatggtg 
tccggatggt 
gtacggatgg 
tgtctggatg 
gtgtctggat 
gtgtccccgt 
gatgtgccgt 
gatgtgcggt 
gatgtggcat 



atctcggctc 
tgagtagctg 
gagataggct 
cccgccttgg 
ttgatctttt 
agtgtatttt 
ctatctcagg 
gagtgtttct 
tgtctcaggc 
ttattgctgg 
gggacctctg 
ttggcccgtg 
aacgtccgct 
ccgcggtgtc 
gcctggcggg 
ctccgagccg 
tctaggctgg 
tggatcacga 
ctaaaaacac 
aggctgaggc 
caccactgca 
aaaaaaaaaa 
taataataga 
gtgggagcat 
tgtaggtccc 
gttgtaccag 
cagtgtccgg 
gtggtgtctg 
atggagtccg 
tgtggtgtct 
tatggagtcc 
gtgtggtgtc 
tgtgcggtgt 
ggtgagctgg 
ctgtgagttg 
gctgtgagct 
cctgcttgtg 
gccctcggtg 
accctgcggt 
ggccctcggt 
gaccctgctg 
agaccctgct 
caggccctcg 
caggccctgc 
ccaggccctc 
ccaggccctg 
gccaggccct 
gccaggccct 
gccaggccct 
caccaggccc 
tcgccaggcc 
gggtgaggtc 
gggtgaggtc 
gggtgaggtc 
gcgtgaggtc 
ggggtgaggt 
eggggtgagg 
ccggggtgag 
tccggggtga 
gtccggggtg 
ggtctggcgt 
ggtccggggt 
aggtctgggg 
caggtccggg 
gcaggtctgg 
tgcaggtccg 
gtgeaggtec 
gctgcaggtc 
gtccgaa tgg 
gtccggatgg 
gtccggatgg 
gtcct tctcg 



actgcaacct 
ggattacagg 
ttcaccatgt 
cctcccacag 
aaaatgaagt 
agctctggcc 
catcttgaca 
gtagctttgc 
ccgccgtctg 
taaaccccag 
ettatgatge 
agtgtctgga 
cggcctgggt 
gagtttgaaa 
ggagtgtctg 
ttgtcgccca 
gcgcggtggc 
ggtcaggagg 
aaaaattagc 
aggagaattg 
ctccagcctg 
aattctagta 
ttttactgaa 
cactcacagg 
gtgcgtggcc 
atggtgcagg 
atggtgcacg 
gatggtgcag 
gatgatgeag 
ggatggtgca 
ggatgatgea 
tggatggtgc 
ctggatggtg 
atatgeggtg 
gatgtggggt 
ggatgtgtgg 
agctggatgt 
agctggatgt 
gagctggatg 
gagctggatg 
tgagctggat 
gtgagctgga 
gtgagctgga 
tgtgaactgg 
ggtgagctgg 
ctgtgagctg 
cggtgagctg 
gctgtgagct 
gctgtgagct 
tgcggtgagc 
cteggtgage 
gctaggccct 
gccaggcctt 
gccaggccct 
gccaggccct 
agccaaggcc 
tcgccaggcc 
gtcaccaggc 
ggtcgecagg 
aggtcgecag 
gaggtcgeca 
gaggtcacca 
tgaggtcgee 
gtgaggtege 
cgtgaggtcg 
gggtgaggtc 

ggggtgagtt 

cggggtgagt 
tgcaggtcca 
tgcaggtctg 
tgcaggtccg 
tttaag 



ctgcctcctc 
cgcccaccac 
tggecagget 
tgctgggatt 
ctgaaacatt 
accccccagc 
cccccacaag 
ccccgccctg 
gggtcccctt 
ctttacctgt 
acagatgaag 
gcaccacgtg 
tcagcctgga 
tcgcgcaaac 
cttcctccct 
acaggagcat 
tcacgcctgt 
tcgagaccat 
tgggcgtggt 
cttgaacctg 
gcaacacagc 
gecacattaa 
gcccagcatg 
acatttgaca 
atctcggcct 
teegggatga 
tctgggatga 
gtcaggggtg 
gtccggggtg 
ggtcaggggt 
ggtccggggt 
aggtctgggg 
caggtctgga 
tccggatggt 
gtccggatgc 
tgtctggatg 
gtggtgtctg 
gcagtgtcca 
tgcggtgtct 
tatggagtcc 
gtgcggtgtc 
tatgcggtgt 
ggtatggagt 
atgtgcggcg 
aggtatggag 
gatgtgcggc 
gaggtatgga 
ggatgtgctg 
ggatgtgctg 
tggttgtgcg 
tggatgtgcg 
tggtgggctg 
tggtgagctg 
tggtgggctg 
gctgtgagct 
ttcggtgagc 
ctgcggttag 
ectgeggtta 
ccctgctgtg 
gccctgcagt 
ggccctgcgg 
ggccctgcgg 
aggccctgct 
caggccctgc 
ccaggccctg 
gccaggccct 
cgccaggccc 
tcgccaggcc 
gggtgaggtc 
gggtgaggtc 
gggtgaggtc 



ggttcaagca 
cacgcc tggc 
ggtctcaaac 
acaggtgcaa 
gctacccttg 
ctgtgtgctg 
ctaagcatta 
cttttcctcc 
ccttgtcctt 
gctggcctcc 
atgtggagac 
gccagcgttc 
aaaccccagg 
ctgcggtgtg 
tctgcttggg 
gaegtgagee 
aatcccagca 
cctggccaac 
ggcgggtgcc 
ggagttggaa 
gagactctgt 
aaaagtaaaa 
tccacacctc 
ttttttgagc 
ggacctgctg 
ggtcgecagg 
ggtcgecagg 
aggtctccag 
aggtcgecag 
gaggtctcca 
gaggtcgeca 
tgaggtcacc 
gtgaggtege 
gcaggtctgg 
tgcaggtccg 
gtgcaggtct 
gatggtgcag 
gatggtgcag 
ggatggtgca 
ggatggtgcc 
tggatggtac 
ccggatggtg 
ccggatgatg 
tctggatggt 
teeggatgat 
gtctggatgg 
gtccggatga 
tatceggatg 
tatceggatg 
gtgtccggtt 
gtgtccccgt 
gatgtgccgt 
gatgtgcggt 
gatgtgtggt 
ggatgtgcgg 
tggatgtggg 
ctggatatgc 
gctggatgtg 
agctggatgt 
gagctggatg 
t tagctggat 
ttagctggat 
gtgagctgga 
ggtgagctgg 
eggtgagctg 
gcggtgggc t 
tgcggtgagc 
cteggtgage 
gccaggccct 
gccaggccct 
accaggccct 



4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6960 
7020 
7080 
7140 
7200 
7260 
7320 
7380 
7440 
7500 
7560 
7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8460 
8520 
8580 
8616 



<210> 6 

<211> 2089 

<212> DNA 

<213> Homo sapiens 
<400> 6 

gtactgtatc cccacgccag gcctctgctt ctcgaagtcc tggaacacca gcccggcctc 60 
agcatgcgcc tgtctccact tgcctgtgct tccctggctg tgcagctctg ggctgggagc 120 
caggggcccc gtcacaggcc tggtccaagt ggattctgtg caaggctctg actgcctgga 180 
gctcacgttc tcttacttgt aaaatcagga gtttgtgcca agtggtctct agggtttgta 240 
aagcagaagg gatttaaatt agatggaaac actaccacta gcctccttgc ctttccctgg 300 
gatgtgggtc tgattctctc tctctttttt ttttcttttt tgagatggag tctcactctg 360 
ttgcccaggc tggagtgcag tggcataatc ttggctcact gcaacctcca cctcctgggt 420 
ttaagcgatt caccagcctc agcctcctaa gtagctggga ttacaggcac ctgccaccac 480 
gcctggctaa tttttgtact tttaggagag acggggtttc accatgttgg ccaggctggt 540 
ctcgaactca tgacctcagg tgatccaccc accttggcct cccaaagtgc tgggtttaca 600 
ggctaagcca ccgtgcccag cccccgattc tcttttaatt catgctgttc tgtatgaatc 660 
ttcaatctat tggatttagg tcatgagagg ataaaatccc acccacttgg cgactcactg 720 
cagggagcac ctgtgcaggg agcacctggg gataggagag ttccaccatg agctaacttc 780 
taggtggctg catttgaatg gctgtgagat tttgtctgca atgttcggct gatgagagtg 840 
tgagattgtg acagattcaa gctggatttg catcagtgag ggacgggagc gctggtctgg 900 
gagatgccag cctggctgag cccaggccat ggtattagct tctccgtgtc ccgcccaggc 960 
tgactgtgga gggctttagt cagaagatca gggcttcccc agctcccctg cacactcgag 1020 
tccctggggg gccttgtgac accccatgcc ccaaatcagg atgtctgcag agggagctgg 1080 
cagcagacct cgtcagaggt aacacagcct ctgggctggg gaccccgacg tggtgctggg 1140 
gccatttcct tgcatctggg ggagggtcag ggctttccct gtgggaacaa gttaatacac 1200 
aatgcacctt acttagactt tacacgtatt taatggtgtg cgacccaaca tggtcatttg 1260 
accagtattt tggaaagaat ttaattgggg tgaccggaag gagcagacag acgtggtggt 1320 
ccccaagatg ctccttgtca ctactgggac tgttgttctg cctggggggc cttggaggcc 1380 
cctcctccct ggacagggta ccgtgccttt tctactctgc tgggcctgcg gcctgcggtc 1440 
agggcaccag ctccggagca cccgcggccc cagtgtccac ggagtgccag gctgtcagcc 1500 
acagatgccc aggtccaggt gtggccgctc cagcccccgt gcccccatgg gtggttttgg 1560 
gggaaaaggc caagggcaga ggtgtcagga gactggtggg ctcatgagag ctgattctgc 1620 
tccttggctg agctgccctg agcagcctct cccgccctct ccatctgaag ggatgtggct 1680 
ctttctacct gggggtcctg cctggggcca gccttgggct accccagtgg ctgtaccaga 1740 
gggacaggca tcctgtgtgg aggggcatgg gttcacgtgg ccccagatgc agcctgggac 1800 
caggctccct ggtgctgatg gtgggacagt caccctgggg gttgaccgcc ggactgggcg 1860 
tccccagggt tgactatagg accaggtgtc caggtgccct gcaagtagag gggctctcag 1920 
aggcgtctgg ctggcatggg tggacgtggc cccgggcatg gccttcagcg tgtgctgccg 1980 
tgggtgccct gagccctcac tgagtcggtg ggggcttgtg gcttcccgtg agcttccccc 2040 
tagtctgttg tctggctgag caagcctcct gaggggctct ctattgcag 2089 



<210> 7 
<211> 687 
<212> DNA 
<213> Homo sapiens 

<400> 7 

gtggctgtgc tttggtttaa cttccttttt aaacagaagt gcgtttgagc cccacatttg 60 
gtatcagctt agatgaaggg cccggaggag gggccacggg acacagccag ggccatggca 120 
cggcgccaac ccatttgtgc gcacagtgag gtggccgagg tgccggtgcc tccagaaaag 180 
cagcgtgggg gtgtaggggg agctcctggg gcagggacag gctctgagga ccacaagaag 240 
cagccgggcc agggcctgga tgcagcacgg cccgaggtcc tggatccgtg tcctgctgtg 300 
gtgcgcagcc tccgtgcgct tccgcttacg gggcccgggg accaggccac gactgccagg 360 
agcccaccgg gctctgagga tcctggacct tgccccacgg ctcctgcacc ccacccctgt 420 
ggctgcggtg gctgcggtga ccccgtcatc tgaggagagt gtggggtgag gtggacagag 480 
gtgtggcatg aggatcccgt gtgcaacaca catgcggcca ggaacccgtt tcaaacaggg 540 
tctgaggaag ctgggagggg ttctaggtcc cgggtctggg tggctgggga cactggggag 600 
gggctgcttc tcccctgggt ccctatggtg gggtgggcac ttggccggat ccactttcct 660 
gactgtctcc catgctgtcc ccgccag 687 

<210> 8 
<211> 494 
<212> DNA 
<213> Homo sapiens 

<400> 8 

gtgggtgccg gggacccccg tgagcagccc tgctggacct tgggagtggc tgcctgattg 60 
gcacctcatg ttgggtggag gaggtactcc tgggtgggcc gcagggagtg caggtgaccc 120 
tgtcactgtt gaggacacac ctggcaccta gggtggaggc cttcagcctt tcctgcagca 180 
catggggccg actgtgcacc ctgactgccc gggctcctat tcccaaggag ggtcccactg 240 
gattccagtt tccgtcagag aaggaaccgc aacggctcag ccaccaggcc ccggtgcctt 300 
gcaccccagt cctgagccag gggtctcctg tcctgaggct cagagagggg acacagcccg 360 
ccctgccctt ggggtctgga gtggtggggg tcagagagag agtgggggac accgccaggc 420 
caggccctga gggcagaggt gatgtctgag tttctgcgtg gccactgtca gtctcctcgc 480 
ctccactcac acag 494 



<210> 9 

<211> 865 

<212> DNA 

<213> Homo sapiens 



Le A 32 805-Foreign Countries 
<400> 9 

gtaaggttca cgtgtgatag tcgtgtccag gatgtgtgtc tctgggatat gaatgtgtct 60 
agaatgcagt cgtgtctgtg atgcgtttct gtggtggagg tacttccatg atttacacat 120 
ctgtgatatg cgtgtgtggc acgtgtgtgt cgtggtgcat gtatctgtgg cgtgcatatt 180 
tgtggtgtgt gtgtgtgtgg cacgtgtgtg tccatggtgt gtgtgcctgt ggtgtgcatg 240 
tgtgtgtgtc tgtgacacgt gcatgttcat gctgtgtgct gcatgtctgt gatgtgccta 300 
tttgtggtgt gtgtgtgcat gtgtccgtga catatgcgtg tctatggcat gggtgtgtgt 360 
ggccccttgg ccttactcct tcctcctcca ggcatggtcc gcaccattgt cctcacgctc 420 
tcgggtgctg gtttggggag ctccacattc agggtcctca cttctagcat gggtgcccct 480 
gtcctgtcac agggctgggc cttggagact gtaagccagg tttgagagga gagtagggat 540 
gctggtggta ccttcctgga cccctggcac ccccaggacc ccagtctggc ctatgccggc 600 
tccatgagat ataggaaggc tgattcaggc ctcgctcccc gggacacact cctcccagag 660 
cggccggggg ccttggggct cggcaggggt gaaaggggcc ctgggcttgg gttcccaccc 720 
agtggtcatg agcacgctgg aggggtaagc cctcaaagtc gtgccaggcc ggggtgcaga 780 
ggtgaagaag tatccctgga gcttcggtct ggggagaggc acatgtggaa acccacaagg 840 
acctctttct ctgacttctt gagct 865 



<210> 10 
<211> 3782 
<212> DNA 

<213> Homo sapiens 



<400> 10 

tgtgggattg 

ggggtaacac 

agctttattg 

ggtgtggagg 

cctgctgtgg 

gatgtggccc 

tttctttttt 

ggctggagtg 

attctcttgc 

aatttttgta 

cctgacctca 

ccgccgcgcc 

cctgcagcct 

ccatttcatg 

gcgtaattgg 

ttgttgtttt 

tctaaacaag 

ctgtggagtg 

ggggtgtggg 

gcggtgctca 
gtgctcc t ta 
catccccttc 
accacaatgg 
atatattggc 
cgacctcaga 
caggccccat 
tcgatgtggt 
agcacagagt 
tgt tagtgtg 
tcccgtagta 
tctctctccc 
gcaatccctc 
gac tgtggat 
tcacaggggt 
gtggatggcg 
tgtggtgact 
gggtctgatg 
ggcggtcgtg 
gac tgtggat 
ctgatgtgtg 
cggtcgtggg 
c tgtggatgg 
gg tctgatgt 
gtggatggcg 
tc tgatgtgt 
ggatggcggt 
tgtggtgact 
gggtctgatg 
ggcggttggt 
gtggtgactg 
ggggtc tgat 
gatggcggtc 
ggtgac tgtg 
acaggggtct 
gtggatggcg 



gttttcatgt 
agagt tcaag 
aggagaccat 
cctcccctgg 
tgtgtggccg 
tggctacgct 
tctttctttt 
gtttggcgtg 
ctcagcctcc 
attttagtag 
ggtgatcctc 
cggccgagac 
tggtgctgac 
actctct tea 
tgtctgctgt 
tccggctcct 
catc tgaagt 
gcaccggtct 
gagecagegt 
gaggegcaca 
tgggaatc ta 
cccactgc tg 
ttggggaccc 
ttttctgtgt 
cccatgggct 
gtacct tcct 
tttagcccac 
caccgtgcgc 
tgtcacgtgc 
aatgacaagc 
gcgtcttcag 
cagcactggg 
ggcagtcggt 
ctgatgtgtg 
gtcgtggggt 
gtggatggcg 
tggtgactgt 
gggtctgatg 
ggcggtcgtg 
gtgactgtgg 
gtctgatgtg 
tgateggtea 
gtggtgactg 
gtcgtggggt 
ggtgactgtg 
cgtggggtct 
gtggatggcg 
tggtgactgt 
cccgggggtc 
tggatggcag 
gtgtggtgac 
gtggggtctg 
gatggcggtc 
gatgtgtggt 
gtcgtggggt 



gtgggatagg 
gegagcttte 
atcttccttt 
gctccctgtt 
gtgggcaggg 
ccgtccttgg 
tttttttttt 
atct tggctc 
caagtagctg 
agacgaggtt 
ccacc tegge 
tcgcttcctg 
aacctccgtt 
cagaagagt t 
ttatcgatgg 
tgaaggaaaa 
tgccgttttc 
ggggcctgtt 
tcccgcctga 
caccctactg 
atgcctgatg 
tcctgtggaa 
tgtgctaaag 
tgagtccaga 
atttgtgggc 
gttactgect 
ggccctgccg 
gtcttttgat 
ctgctcacat 
gtcctggggg 
actc ttctcc 
ctggagaggc 
caegggggtc 
gtgactgtgg 
ctgatgtgtg 
gtcgtggggt 
ggatggcagt 
tggtgactgt 
gggtctgatg 
atggcggtcg 
gtgactgtgg 
caggggtctg 
tggatggtga 
ctgatgtgtg 
gatggegate 
gatgtgtggt 
gtcgtggggt 
ggatggcggt 
tgatgtgtgg 
tcgtggggtc 
tgtggatggc 
atgtgtggtg 
gtggggtctg 
gactgtggat 
ctgatgtggt 



tggggatctg 
ttcctgtagt 
gaactatggt 
ctgtttcttc 
ct tccaggcc 
aattcccctg 
tgataacaga 
actgcaacct 
gaattatagg 
tctccatgtt 
c tcccaaagt 
cagcttccgt 
ttccttctcc 
tcacgtgtgc 
cctccttcca 
gtttcgatta 
cctctaaagc 
aggaaccegg 
gccccgcccc 
agaac tgtgc 
atctgaggtg 
aaategtett 
acc tgettea 
ataattaegg 
gtgttgcctg 
tccaggttgg 
ccagctcctg 
gcctcacaag 
cctgtct tgg 
agtctgeaga 
tgcctgtgct 
ccgggagctc 
tgatgtgtgg 
atggcggtcg 
gtgactgtgg 
ctgatgtggt 
cgtggggtct 
ggatggcagt 
tgtggtgact 
tggggtctga 
atggcggtcg 
atgtgtggtg 
teggtcacag 
gtgactgtgg 
ggtcacaggg 
gactgtggat 
ctgatgtggt 
cgtggggtct 
tgactgtgga 
tgatgtgtgg 
ggtcgtgggg 
actgtggatg 
atgtgtggtg 
ggcggtcgtg 
gactgtggat 



tgggattggt 
gggtctgcag 
egggtttata 
cactctgggg 
tccttgtgtt 
cgagttggag 
gtctcgctct 
gtgcttcctg 
cgcccaccac 
ggccaggctg 
gctgggatga 
gagatctgea 
aggtctcget 
tgatttcccg 
tttcctttag 
tggatgtttg 
agggatcccg 
cgcacagcgg 
tctcagatca 
gtgagagggg 
gaaccgtttg 
ccacgaaacc 
geagee tc tc 
atttctgtga 
ctcctgggt t 
ttctcagggt 

ggggctgggg 

ctcgaggcct 
ggacgcaggg 
ataggaggtg 
gtggctgcac 
gagtgecac t 
tgactgtgga 
tggggtctga 
atggcggtcg 
gactgtggat 
gatgtgtggt 
cgtggggtct 
gtggatggcg 
tgtgtggtga 
tggggtctga 
actgtggatg 
gggtctgatg 
atggcggt tg 
gtctgatgtg 
ggcggtcgtg 
gactgtggat 
gatgtgtggt 
tggcggtcgt 
tgactgtgga 
tctgatgtgt 
gcggtcgtgg 
actgtggatg 
gggtctgatg 
ggcggtcgtg 



ttttatgagt 
gtgctccaac 
gtaagtcagg 
tcgtgtggtg 
cattggcctg 
gctttctttc 
tttttgecca 
agttcaagca 
catgetgact 
gtctcgaact 
caggtgtgaa 
gcgatagctg 
aggggtcttt 
gctgtttcct 
gctttgttta 
aactttcttt 
aggcccctgg 
gaggctaggt 
gcagtggcat 
tctagattct 
c tcccaaaac 
agtccctggt 
gtcagtgttg 
tgctttccgc 
gggaagggtg 
tgaategtae 
aacatgetga 
cctgtgtccg 
gcttagcagg 
ggggtgccgg 
ctgcatccct 
tgtgccacgt 
tggcggttgg 
tgtggtgact 
tggggtctga 
ggcggtcgtg 
gactgtggat 
gatgtgtggt 
gtcgtggggt 
ctgtggatgg 
tgtgtggtga 
gcggtcgtgg 
tgtggtgact 
gtcccggggg 
tggtgactgt 
gggtctgatg 
ggcggtcgtg 
gactgtggat 
ggggtctgat 
tggcggtcgt 
ggtgactgtg 
ggtc tgatgt 
gtgatcggtc 
tgtggtgact 
gggtctgatg 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 
tgtggtgact 
gtcacagggg 
actgtggatg 
tgatgtgtgg 
tcggtcacag 
actttgcgtc 
tgggcttcat 
agtgcccagc 
ag 



gtggatggcg 
tctgatgtgt 
gcggtcgtgg 
tgactgtgga 
gggtctgatg 
ctcggccccc 
cccgccatcg 
tctggccggg 



gtcgtagggt 
ggtgactgtg 
ggtctgatgt 
tggcggtcgt 
tgtggtagct 
cggcccccgt 
ggcttggccg 
gcaggccaca 



ctgatgtgtg 
gatggcggtc 
gtggtgactg 
ggggtctgat 
gcaggtggag 
ttcccaaaca 
caggtccaca 
tttgtggctc 



gtgactgtgg 
gtggggtctg 
tggatggcgg 
gtggtgactg 
tcccaggtgt 
gaagc ttccc 
cgtcc tgatc 
atgccctctc 



atggcagtcg 
atgtgtggtg 
tcgtggggtc 
tggatggtga 
gtctgtagct 
aggcgctctc 
ggaagaaaca 
ctctgccggc 



3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3782 



<210> 11 
<211> 980 
<212> DNA 
<213> Homo sapiens 



<400> 11 

gtctgggcac tgccctgcag ggttgggcac ggactcccag cagtgggtcc tcccctgggc 60 
aatcactggg ctcatgaccg gacagactgt tggccctggg gggcagtggg gggaatgagc 120 
tgtgatgggg gcatgatgag ctgtgtgcct tggcgaaatc tgagctgggc catgccaggc 180 
tgcgacagct gctgcattca ggcacctgct cacgtttgac tgcgcggcct ctctccagtt 240 
ccgcagtgcc tttgttcatg atttgctaaa tgtcttctct gccagttttg atcttgaggc 300 
caaaggaaag gtgtccccct cctttaggag ggcaggccat gtttgagccg tgtcctgccc 360 
agctggcccc tcagtgctgg gtctgaggcc aaaggaaacg tgtccccctt cttaggagga 420 
cgggccgtgt ttgagccacg ccccgctgag cgggcctctc agtgctgggt ctgtccacgt 480 
ggccctgtgg ccctttgcag atgtggtctg tccacgtggc cctgtggctc tttgcagatg 540 
cctgttagca cttgctcggc tctaggggac agtcgtgtcc accgcatgag gctcagagac 600 
ctctgggcga atttccttgg ctcccagggt gggggtggag gtggcctggg ctgctgggac 660 
ccagaccctg tgcccggcag ctgggcagca actcctggat cacatatgcc atccgggcca 720 
cggtgggctg tgtgggtgtg agcccagctg gacccacagg tggcccagag gagacgttct 780 
gtgtcacaca ctctgcctaa gcccatgtgt gtctgcagag actcggcccg gccagcccac 840 
gatggccctg cattccagcc cagccccgca cttcatcaca aacactgacc ccaaaaggga 900 
cggagggtct tggccacgtg gtcctgcctg tctcagcacc caccggctca ctcccatgtg 960 
tctcccgtct gctttcgcag 980 



<210> 12 
<211> 2485 
<212> DNA 
<213> Homo sapiens 



<400> 12 

gtgagtcagg 

tgctcacctc 

ttttctggcc 

ggcteggett 

ggggtgtgga 
tgcgccgagc 
cggctctcac 
tcctgtccct 
cccatctgga 
ccatggggca 
cggtggtaga 
cacacctccc 
ggaggaaat t 
ggaggectet 
tttatttaaa 
at tataatat 
cacaaattgc 
taagcggccc 
egggee tec t 
gtggcagggc 
ggtgactgtg 
tggtccagt t 
acagagagag 
gggctggccg 
gcccatcact 
ttttttgaga 
actcactgca 
gctgagatta 
gggttt ttgc 
ctcggccccc 
ct t t t taagg 
getgeaageg 
tcgcgtggca 
actgt t tgtc 
gtcatgctga 
catgagtc tt 
teattceggg 
aaagaaaacc 



tggccaggtg 
tctcctgccc 
cccgccccct 
gcggcagccg 
gttgctcctg 
gtttgagect 
aegcttgtat 
gtcgtgtgac 
aagtgcgggg 
ggcggcctgg 
gccacagtgc 
ggcaggcatc 
cgtgcacact 
etc tgggatc 
aatataacta 
ttat taaagt 
acatggcagc 
ccaggcccac 
tcgtggtcgt 
tttggggaat 
tc tgtcc tgt 
tggcctctga 
tttcccatcc 
gactcctaga 
gtgatatctg 
eggaaegtea 
acctccgcct 
caggcaccca 
catgttggcc 
caaagtgctg 
tgaccaccta 
tc tc t tagca 
gccatgcc t t 
tgaaaacgea 
aactaggggc 
tcaccgtgga 
tcaagtgtc t 
t tga tga t tc 



ccattgccct 
cttccccact 
ccggctcctg 
gageggagea 
cgtggaggac 
gcagcttgtc 
ctctctctcc 
ccccgcgagg 
ttgaccgtgt 
gagagctgee 
ctggtgccac 
tgcctgcgac 
caaggtcatc 
gtctccagcg 
ttaattattg 
ataattagaa 
agagtgaat t 
agaattege t 
gaattttatt 
gtgaggtgat 
ccc taggaca 
ataaaaacgt 
catgtgctca 
gttggtgcgt 
caccagcaag 
ctgttgtctg 
cccgggt tec 
ccccctgcgc 
aggctggtc t 
ggattacagg 
tagege ttcc 
acaggagtgg 
ctg tgtgcac 
ccc t tggcat 
aaggt tgtat 
caaat tcctt 
ggt tctgtga 
agagcaagga 



gcgggtggct 
gnccttc tgc 
ggctgeagge 
ggtgccacac 
gaggggeggg 
agctccaagt 
cgatacaaaa 
gcgcgggctc 
agtttgctcc 
gtcacacagc 
atcacgtcct 
cctgtgtgtg 
agcaaggtca 
gataaaggac 
cattataagt 
atattaagta 
ttggccgagg 
gacaaagtca 
aagatggatc 
gactgcgtcc 
cggacaggcc 
ct tcaaaacc 
caggggegta 
gtgette tgt 
gaaagectet 
cctgggc ttg 
agcatttctc 
ctggctaatt 
cgaactcctg 
tgtgagccat 
cgaaaataac 
cgtcctgtgg 
ct ttaggttc 
ccttgtttgg 
ccgt tggege 
gaaaaaaaaa 
a taaact eta 
tgtggtcaca 



gggcgggctg 
ccggggccac 
tcccgaggcc 
gaggcctgga 
gggtgtgtct 
tactactgac 
ggattttatc 
ttctctctgt 
tetegggggg 
cactgggtga 
ctggatttta 
cc tggggaga 
tccgcagtca 
tgtgcacagc 
aatcactaat 
gtacacacgt 
gacacgtgtg 
cc tccccaga 
aagtcacgta 
tcatgccc tg 
cgaagctc ta 
tgt tgcccca 
tc tgcttgcg 
gcaaaaagtg 
tttcttttct 
agtgcagtgg 
ctgcctcagc 
tttgtatttt 
acctcaggtg 
cacgcccagc 
aggtcttgtt 
gc tctgggga 
caeggggcta 
agagtttctg 
geageggcta 
aaaggagtcc 
agatt taaga 
cc tgtggctg 



gcagggcttc 
cagagtctcc 
ceggaaacat 
aatggcaagc 
gggtcaggtg 
gctggacacc 
cgattctcat 
gactagattt 
cctgtggtgg 
gccacactca 
agtaaaacca 
gtggtagcac 
ggtggaacgt 
tteggaaget 
ggtatcagca 
tctggaaaaa 
cacatgtgtg 
gaagccacca 
ccgtccacgt 
acagacagga 
gtccccatcg 
aaaactaaga 
ttgactcget 
cag tcctct t 
ttcttttttt 
cgcgatctca 
ctcccgagca 
tagtagagag 
atccacccac 
eggaaagect 
tttgeagtag 
tggctgaggg 
ttctgctctc 
cttctcgttg 
catgtagggt 
ggt taagcat 
aaccttaatg 
gatctgtttc 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 
agccgcccca gtgcatggtg agagtgggga gcagggattg tttgttcaga ggtctcatct 2340 
ggtatgtttc tgaggtgttt gccggctgaa tggtagacgt gtcgtttgtg tgtatgaggt 2400 
tctgtgtctg tgtgtggctc ggtttgagtg tacgcatgtc cagcacatgc cctgcccgtc 2460 
tctcacctgt gtcttcccgc cccag 2485 

<210> 13 

<211> 1984 

<212> DNA 

<213> Homo sapiens 



<400> 13 

gtgaggcctc ctcttcccca ggggggcttg ggtgggggtt gatttgcttt tgatgcattc 60 
agtgttaata ttcctggtgc tctggagacc atgactgctc tgtcttgagg aaccagacaa 120 
ggttgcagcc ccttcttggt atgaagccgc acgggagggg ttgcacagcc tgaggactgc 180 
gggctccacg caggctctgt ccagcggcca tgtccagagg cctcagggct cagcaggcgg 240 
gagggccgct gccctgcatg atgagcatgt gaattcaaca ccgaggaagc acaccagctt 300 
ctgtcacgtc acccaggttc cgttagggtc cttggggaga tggggctggt gcagcctgag 360 
gccccacatc tcccagcagg ccctcgacag gtggcctgga ctgggcgcct cttcagccca 420 
ttgcccatcc cacttgcatg gggtctacac ccaaggacgc acacacctaa atatcgtgcc 480 
aacctaatgt ggttcaactc agctggcttt tattgacagc agttactttt ttttttttaa 540 
tactttaagt tctagggtac atgtgcacga cgtgcaggtt agttacatat gtatacatgt 600 
gccatgttgg tgtgctgcac ccattaactc atcatttaca ttaggtatat ctcctaatgc 660 
tatccctccc cactcccccc atcccatgac aggccctggt gtgtgatgtt ccccaccctg 720 
tgtccaagtg ttctcattgt tcagttccca cctgtgagtg agaacatgtg gtgtttggtt 780 
ttctttcctt gcaatagttt gctcagagtg atggtttcca gcttcgtcca tgtccctaca 840 
aaggacatga actcatcctt ttttatgact gcatagtatt ccgtggtgta tatgtgccac 900 
attttcttaa tccagtctat catcgatgga catttgggtt ggttgcaagt ctttgctact 960 
gtgaatagtg ccgcaataaa catacgtgtg catgtgtctt tatagcagca tgatttataa 1020 
tcctttgggt atatacccag taatgggatg gctgggtcaa atggtatttc tagttctaga 1080 
tccttgagga atcaccacac tgtcttccac aatggttgaa ctagtttaca ctcccaccaa 1140 
cagtgtaaaa gtgttctggt gctggagagg atgtggacag cagttatttt tttatgaaaa 1200 
tagtatcact gaacaagcag acagttagtg aaggatgcgt caggaagcct gcaggccaca 1260 
cagccatttc tctcgaagac tccgggtttt tcctgtgcat cttttgaaac tctagctcca 1320 
attatagcat gtacagtgga tcaaggttct tcttcattaa ggttcaagtt ctagattgaa 1380 
ataagtttat gtaacagaaa caaaaatttc ttgtacacac aacttgctct gggatttgga 1440 
ggaaagtgtc ctcgagctgg cggcacactg gtcagccctc tgggacagga tacctctggc 1500 
ccatggtcat ggggcgctgg gcttgggcct gagggtcaca cagtgcacca tgcccagctt 1560 
cctgtggata ggatctgggt ctcggatcat gctgaggacc acagctgcca tgctggtaaa 1620 
gggcaccacg tggctcagag ggggcgaggt tcccagcccc agctttctta ccgtcttcag 1680 
ttatttttcc ctaagagtct gagaagtggg gccgcgcctg atggccttcg ttcgtcttca 1740 
gctggcacag aattgcacaa gctgatggta aacactgagt acttataatg aatgaggaat 1800 
tgctgtagca gttaactgta gagagctcgt ctgttggaaa gaaatttaag tttttcattt 1860 
aaccgctttg gagaatgtta ctttatttat ggctgtgtaa attgtttgac attcagtccc 1920 
tcgtagacag atactacgta aaaagtgtaa agttaacctt gctgtgtatt ttcccttatt 1980 
ttag 1984 



<210> 14 
<211> 1871 
<212> DNA 
<213> Homo sapiens 



<400> 14 

gtgaggcccg 

cccccgtgtc 

gtgctggatc 

caccttcggg 

acacacgtgg 

ggcggctcct 

ggcggcagcc 

gctggcccac 

agaatttgga 

cagagttgat 

gggat tgtcc 

agtgcgattt 

tggccgccag 

cccgccc tec 

gcctggcgtt 

aagat tcact 

gctgagggga 

ggecagggag 

gatgagtegg 

ggagc tgege 

ggggccacag 

gggcacaggg 

ctcgacgtga 

gagcaggaac 

aggagaaaac 

tcttgtccag 



tgccgtgtgt 
ctgcccc tgg 
egcaagagea 
agggagtggg 
tgagtgcagg 
ggggccccag 
tcctccccag 
agegttcget 
tttgctgagt 
ttttgtgaat 
aatgtggtcc 
gacgagggac 
gggtggt ttc 
aag tccaccc 
cc t tgtgccg 
eggggggage 
cagagcagac 
gtggctcaga 
cagccatgta 
agetggcega 
cagaggccgc 
gggctccctg 
age tgacgac 
tcagaaccct 
aggcaaagtc 
attttagtct 



ctgtggggac 
caccgcagcg 
gaggegcttg 
taccgtgcag 
cggtgacctg 
tgagaccccc 
ggtgcacctg 
gcggtcacgt 
getgetgtet 
caaactaaaa 
ccc tcaaggg 
gagaaacct t 
aggtget t tg 
tccaggtcca 
cagcccggag 
ccaggtccca 
ggggaacgct
gtgtatgttg 
acaggaaggg 
ggtcccaggg 
aggaagggaa 
agctgggtga 
tggtgttgcc 
cccctttgtc
gttgagaaac 
gccccggacc 



ctccacagcc 
ttgtctctgc 
gccgtgcacc 
gccctggtcc 
gctcctgctg 
aggagctgtg 
agcctgcgga
tcctgcgtgg
tgaaccacgg
tcaggcacag 
cgccccacag 
gaaagctgta 
ctgggctgtg
ccctccaggg
cacagcaggc
agcaactgag
gcttctgtgt
gggtcccacc 
gtggccacag 
ccaggccaca 
ggggatgccc
gcgaggctca
cagctcacag
taaagcacag
gtcttaaaag
acagatgagt 



tgtgggcttt 
caagtcctct 
caggcctggg 
tgcagagacg 
ctctttggaa 
cacagggcct 
gagcaggagc 
ggttgtttgg
agatggctag 
gggacctggc 
agccggtggg 
aagggaaccc 
tttgtgaaaa 
ccgccctggg 
tgtgcacatt 
ggctcaggag
ggcaagttcc

gggggcagaa 

ggagctggga 
ggaagggcag 
aggccagagc
tgactcggcg
cccagccagg
cagatgcctt
aaggtgggat
ctataacggg



gcagttgagc 
ctctctgccg 
ggcgcagggg 
cacccaggtt 
agtcaagagt 
gcagggccga
tgctgagtga 
gatcggtggg 
gagtgggttt 
ctcagcacag 
cttgttttaa 
tcagaaaatg 
cccatttgga
ctgggggtat
taaatccact
tcctgaggct
tgagggtgct
ctctgtctct
atgcaccagg
ggggacgccc
agaggctacc
agggaacctc
tcccgcgcct
cagggcatct
ggtggcaatt
attgtggtgt



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 
tgccatgggg acacatgaga tggaccatca cagaggccac tggggctgca cctcccatct 1620
gagtcctggc tgtcccgggt ccaggccagg ttcttgcatg ctcacctacc tgtcctgccc 1680 
gggagacagg gaaagcaccc cgaagtctgg agcagggctg ggtccaggct cctcagagct 1740 
cctgccaggc ccagcaccct gctccaaatc accacttctc tggggttttc caaagcattt 1800 
aacaagggtg tcaggttacc tcctgggtga cggccccgca tcctggggct gacattgccc 1860 
ctctgcctta g 1871 

<210> 15 

<211> 3801 

<212> DNA 

<213> Homo sapiens 



<400> 15 

gtgagcgcac 

ccgttgcgtc 

ggccacaggg 

tgagggtgct 

cagacctggg 

gggccctgct 

cctctgaacc 

cagggccctt 

agtctacagg 

tgtggggggg 

gccccgtctc 

ctgtttcttt 

ataatcccag 

ccaacctaac 

cctggtggca 

aacccaggag 

acagagtgag 

ggacaggtgt 

gaactggggg 

tggttgttaa 

tggactttgc 

tggacaccct 

ggtgcagaca 

ggatgccggt 

ctggcctcca 

cgcctctccc 

cttattttgc 

gcacagcatc 

gttctctcta 

catcagatgt 

gtgtgcttgc 

gggtgaactc 

gaagaaaaca 
ttcttgtcca 
gatggacaga 
gtatgtggca 
gactggaagc 
aggagacaca 
aggtgaacgt 
tgaggcaacg 
atggccattc 
cacattcatc 
aggggagcag 

ctcctgctcc 
agggtgagcc 
gatgcacact 
gagacccatc 
atgctggctc 
acttttctgg
taattccact
ccaggcaggg
tattatgcat
gtttaatggc
cacaccccag 
gggcagtgag 
ctggagacac 
gctcttccat 
ccacaaaaaa 
acagtttatt 
tgcacaaaca 
gagtttggtc
cagccccctc 
aggctctttg 
acattccccc 



ctggccggaa 
cacctctgct 
tgcccctcgt 
cacaacggga 
tgcactgagg 
gggcgtgagt 
cgagaccctg 
ttgggcgtga 
atgccatgag 
gtctctacaa 
aggctcagac 
tatgaataaa 
cactttggga
caacatagtg 
cacgcctgta 
gcagaggttg 
acttcatctt 
ttttttattc 
tgccttcctc 
accagaggtt 
ctctttccag 
cgtgatgggg 
cccttgtgca 
ctcctgtgct 
ctggctttgt 
aggcacctct 
tccccatgaa 
agtgaatgtt
aacacattgc 
gggtccaatg 
agaggtggct 
acatcctctg 
ggcaaaatga 
gattttagtc 
acaatagaac 
cagctgatgg 
aaataagttg 
tgcaaacaac 
tccctggttt 
ggcattgctt 
cctggagcgt 
ctctcacttt 
ccgcccttgg 
ggggcccttg 
ggagcccaag 
tgggaaggtc 
cctcaaagaa
cttttctggg
aaagcagctt
tctgaagtga
ggacttgcca
cacaaaactt
acaaaacgtt 
gagcctgccg 
tggtggtgag 
catgtgtgcc 
ccctgagatt 
cctgagtcac 
atgtgttttt 
cggccgtgcg 
atgcagagtc 
gggctgcagc
gagcaagctt
ctgtgtctca



gtggagcctg 
tccgtgtggg 
cccatctggg 
gcagttttct 
tgtcttcaga 
ctctcaaacc 
gggccctgct 
gtctctccgc 
ttcatgatca 
aattctgggg 
acaaatgaat 
aagtatcaac 
ggccgaggtg 
aaattccatt 
gtccccgcta 
cagtgagccg 
aaaaaaaaaa 
tgtccttcga 
tgaaaggcac 
taaactgggg 
aatgctccct 
gagcagcagg 
tggtgcccag 
ccccacagtc 
ctgcatgatt 
gcagtgctgg 
atgtattttt 
attgaaggac 
aaagccacag 
ccagaatatt
ctaaaagctc
tgtctgaagt
ttaagaaaag
tcccaaacca 
aaaacggaag 
aaaagagagt 
tgtctttaca 
accagcaaca 
ggtgttgggg 
tcactgcaga 
ttgtgcacgt 
gttctcctaa 
tcacccagct 
ctctgcccga 
gtcgtgttgg 
ctaccagcag 
acgcacgtga 
cttgccaaga 
gtttgcatgg
ccagacatta
cagcaagtca 
gctctgccat 
tatttcaatg
tgaatgtcat 
gccctggagg 
acgtgcactc 
caaacacagt 
acctgtgttc
ggctgagtta 
aggtttggat 
tggatggcat 
gcatgcccca 
tgcaggaggg 

g 



tgcccggctg 
gcaggcgact 
gctgagcaga 
gtgctatttt 
aagcagtctg 
cgaacacagg 
gggcgtgagt 
tgtgagcccc 
cgtgtgaccc 
tcttgtttcc 
tgaagatgga 
attccaggca 
ggtggatcac 
tctacttaaa 
tgcgggaggc 
agatcacacc 
aaaaaagtat 
taatatttac
accttcatgg 
tcctgtcgtt 
ggggtttgct 
tgcagacgcc 
catgtccctg 
cctgcttccc 
tccacatttc 
ccataccagt 
taggacaggc 
aaaggacaga 
aggctagtgc 
ctgtgctccc 
agcagtggag 
atacagcaga 
tgaaaaagga 
cagctcagat 
ccctatctct 
gtgtgtgtaa 
gcatatacca 
gaaataaaac 
aaggacacac 
gaaactcagc 
gatttattta 
ccacctgaga 
ggcaaagggc 
ggaccccaca 
ggatggctgt 
cgtcaaagaa 
aactgatggc
gccagcatca
aagtcctcac
tcacgggtct
cgaacctgcc 
taaacatttt 
tagcagtgtt 
gtgtgttcat 
acatcggtgg 
actggagccc 
gagattcccc 
actcgaggga 
tgtgcagatc 
acactcaaca 
gtagcatttg 
ggcaggacaa 
ggctgggtgt 



gggcaggtgc 
gccaatccca 
aatgcatctt 
ggtaaaagga 
gatccgaacc 
ggccctgctg 
ctctccgaac 
acactccaag 
atcaggggac 
ccagagcccg 
cacagatgca 
gggcaaggtg 
ttgaggccag
aaatacaaaa 
tgaggcagga 
actgcactcc 
cagcattcca 
tggtgctgtg 
gaagagaaat 
ctgagttaac 
tcatggggga 
ctcatgatgg 
ttgcagctcc 
tctcacagcc 
ctgggctccc 
cagctgtgaa 
acccctggtt 
caaacaaatc 
aggatgggtg 
aaaggccact 
gcagtggttc 
ggcttgaagg 
aaagtggtaa 
ggtagaatgt 
cagaaacgtg 
tttttttttc 
gagcagattc
aaaagactca 
agggaggcgg 
ttgcctgagc 
aggcgccctg 
ggtagaggag 
atgcatgatt 
caagtcagac 
gaaagaagaa 
atgcatgtga 
gagacctgtc 
ggttgaggca 
aatgtcctgt 
tatttaccat
caaatacagg 
tcaaagaatt 
caaagctgga 
ctttggacat 
gatgcctcca 
tgtttagctg 
acgcccaact 
cgcccgggag 
tcatcagggc 
tcactagcca 
gagtccatgg 
ggaagcggga 
ggggcaggca 



tgctgcaggg 
aagggtcaga 
tctgtgggag 
aatggtgcac 
caagacgccc 
ggcatgagtc 
ccagagactt 
gctcatccac 
agggccatgg 
agagctcaag 
gaaatctgtg 
gctcacacct 
gagtttgagg
attagcctgg
gaatcatttg
agcctgggca 
aaaccatagt 
ctagaggccg
aagtggtgaa 
agtccagatc 
gcagcaggtg 
gggagtggca 
ctccccacaa 
ttacctggtc 
agcacctctt 
ctgtccactg 
ccagcctctg 
aggaaaatgg 
ggcatcaggt 
tggtcagagt 
gccatactca 
gcatctggga 
gatgggaatt
ggtcagaact 
tgttaatgtg 
tgagaaaact 
taggtagaag 
aagggaaggg 
atgaaaccag 
cacagtgaaa 
tgaggtcctg 
gaaaggctcc 
gcagcctggc 
ccataggctc 
atggacgtct 
aactgacagc 
cccatccctc 
agctggaaag 
gtcttcccag 
ttccagtgtt 
gctaaggaga 
tttgaagaat
tgtaaaagaa
ggacatacat
tcctgcccct
gtgccacctg 
cagtgttctc 
ccagggctcc 
agatgatgag 
ggtcctggtg
agtgagcacc
ggaaggcagg
cctgtgtctg



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3801 
<210> 16
<211> 880 
<212> DNA 
<213> Homo sapiens

<400> 16 

gtgagcaggc 

gtgtgtgtgt 

acatgtacgc 

cagtgtgtgc 

tgcatgtgtg 

gcagtgtgag 

taggtcctca 

gctgaggctc 

ccctctcctg 

actccctctc

cctcccctct 

gagcctcggg 

ccgggtgagg 

ctctttcttg 

gaggtttcta 

<210> 17 
<211> 3186 
<212> DNA 
<213> Homo sapiens

<400> 17 

gtgagccgcc 

tctgacccgg 

ttcagtggtg 

tctccatctc 

gcactggccg 

atgtttatgg 

tcagatgccc 

tcgctggccc 

gcacctgtgc

ggtgcccctg 

acagggccag 

aaagggcagt 

ggagctgaat 

atttgtgtta 

gtcgtcgtct 

gtaccatgaa 

agagccacag 

ctcagttcca 

aaatcttccc 

ctcctttccg

ctgacactgt 

tgaggtcaga 

tgccacctgc 

agctccgagg

ccatgtgggg

tgggccacgg

ccctcgaggc
cagtgggcga

tggggctcgg

cccgacctct 
tgtttccctc 
atttgtttaa 
cacctcagca 
aggtcattcc 
gtgggtgaga 
ctcaggcact
acgggggctc
ccaggtctgt
gacagggctg
aggatccctt
tacgaaaaat 
ttgggaggct 
gattgcacca
agaagactga
cagaagccaa
ccccagaccc 
ttgatatacg 
tggtacacaa
aatcagaagc 
gtgttcatac 



sapiens 



tgatggtcag 
gtgcgcgcgt 
atatacacgt 
acaggtgtgc 
ttcgtgcaca 
tagcatgtgt 
gcaccagtgc 
tgaagctgca
tgggcttctg 
tcctgtgggc 
ctgtgggcat 
ggcaggcaga 
gccaggccgg 
ttccatctga 
ccgtttctca 



cacagagttc 
gcctgcaagg
gagcacatac 
aagggcacaa 
gtcgtgtggg 
gcacataaca 
cactccttac 
gccctgaggg
tgtccactcc 
atccgcgtcc 
ttgcgtccac 
tgacacagag 
atttcactgg 
atggatgata 
ctctttcttg 



agagttcagg 
ctgatggtga 
atgtgtgcat 
gtgtgtgcac 
cattcacgtg
tgtattgagg 
aggatgagac 
cattgtccca 
ccctctcctg 
actccccctc 
tccctctcct 
tcttgactcg 
gaagagggat 
aagcaaaaag 
gcgactctag 



aggtgtgtgc 
ctggctgcac 
gtgtgtacat 
atgcgaatgc 
aggtgcatgc 
ggtcctcgtg 

ggggtcccag 

tctgggcatc 
tgggcattta 
tctgtgggca 
ggttccttcc 
cccagggtgg 
agtttcttgt 
taaaaactta 



gcaagtatgt 
gtaagagtgc 
gaaggcatgg 
acacctgaca 
gtgtgggtgt 
ttcaccccgc 
gccttggtgg 
cgcgtccact 
catccactcc 
tctgcgtcca 
tgtcttggcc 
ttcgcagctg 
caaaatgttc 
aaatcccaga 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

880 



sapiens 



accaaggggt 
ggcttcacct 
ctgctgcctg 
tgggtagtgg 
tgggacgtca
ggagtcttag 
ggaggatttg 
cacccccggg 
tctgggcatg 
ccaagaatcg 
cttctgcctg 
cgggcaccac 
gccaggaggc
cccagggccg 
atcgtggaaa 
aatggttttt 
ctgcatgtta 
gggtgcgtcc 
tcgtttgcat
gaaacccttg 
ggttgacccc 
ggagttttcc 
tcctcccata 
agctcccgta 
acccttgggt 
gccgtttcca
aggagtggga 
ggctgtggtg 
ccttcctggc 
agcaggtggc 
ctgggtcagg 
aaacattctg 
gagttactga 
agaagtggct 
tgaggtacac 
gggtgagatg 
tgatcacacg 
gcacacctgc 
gcgcggtggc 
gagcccagga
aaaaacaaaa 
gaagtgggag 
ctgtactgca 
caaatgcagt
gtcggtgtct
agggtttatg
atgacatcaa 
ggaacaatgg 
cagcatgggg 
agatggtgca 



gcaggcccag 
tggaactcct 
tgcacagttc
taggagccgg
tggaggccat
cagaggaggc 
gggtctcagc 
aaggtgcagc 
gctgtgctcc 
acaactttat 
gagtcagggc 
aggcccgggc 
cgaagccctc 
aggctgcgcg
cccagcaagg 
aacccgagtg 
ccgcctttgc 
ggctcagacc 
ctccctgacg
gggtgtgctg 
agggtccagc 
caggtgaaaa 
ttcagctcag 
gagggcctgg
agtcgcttga
aacacagagt
gaacggagag
gtccacgtgg
ccgtgctggc
tatttctccc 
agcgtggccg 
ggcctggctt 
gaggctgaaa 
caggaagtca 

ggggggctca 

aggtacacgg
cacatatgag 
cccaaagtcc 
tcacacctgt 
gtttaagacc
attagctgaa
gatcacttga
gcctgggtga 
ttcttggaaa 
cggtgtcagt 
caccacaggg 
ggttgtctga
ataaactgga
ggctggcatc
cagaaacgca



cctccaggga 
gggttttagg
tgttcgcgtg 
tgtggcccca 
cccagggcag 
tgggaaggtg 
aaagagggcc
agagctgtgg 
tggaacgttc 
cacagaggga 
aggtggtggc 
ctccacctca 
gccccatgag 
aattaccgtg
gctcacggga
cttgcgcctt
accagctcca 
gccctcctct 
cgtgcctggg 
gatacaggtg 
tggcgtgctt 
ctcctgggaa 
tcttgtcctc 
gctcagggca 
ttgggtagcc
caggcacgtg
ctgggccccg 
cgctgggggc 
cgcgcctcca 
tttggaagag 
tgtggcaacc 
ccgttgttgc
ccggggtgct 
gtgagaccag 
ggcagtgggt 
ggggctcagg 
cacatgtgca 
caggaagctg 
agtcccagca 
agcctgagca
catggtggtg
gcccaggagg
cagagtgaga
gaaacattta
gagatgagat 
gcgggtggct 
cgaagggcag 
aaccttagag
caggatggag
gtgtacctgt



ccctccgcgc 
ggcaaggaat 
gctctgtgca 
ggtgtcccca 
caggggcatg 
tctgaacagt 
gaggtgggtg 
ctccccacac 
cctgtcctgg 
agggccaatc
acaagcctcg 
acaggcctcc 
ggctgagaag 
cacacttgat 
gagttttcca 
catgctctgg
gaggcttggg 
ctgccttctc 
ccctcgtgca 
ccactgagga 

ggggcctcct 
actcccaggg 
atttccccac 
gggcggctga 
ctgaggaggc 
gaaggcccag 
atttcacggc
ggggtctgat

cacgggcttg
agcccctcac
ccgggacctt
taaatgggga 
ggcttgactg 
gtacatgggg 
gaggccaggt
cagagggtca
catgtgctgt
agaggccaaa
ctttgggagg
acatagtaga 
tgcgcctgta 
tggaagctgc 
gcccatctca 
gtaggaactt 
gatgggtcct 
cagaagggat 
gattcatgat 
gccttcccgg 
ctgcttcagc 
gcacacacag 



tctgctcacc 
gtcttacgtt 
aagcacctgt 
ctgtgcctgt 
gggtaaagag 
agatgggaga 
caggtgaggg 
agcccggcca 
ctggtcaggg 
tgtggaggcc 
gggctgtacc 
cgagccactg 
gagtgtgagc 
gtgaaatgag 
ttacaaggtc 
cagggagggc 
accaggctgt 
tctctgcctc
agctgcttga
ctggaggtgt 
tgggccatga 
ccatgtgacc 
cagggtctct 
gtttccccac
cgagatgcga
gaatcccctt
agccaggctg
tcaaatccgc
gggtggacgc
ccatgctagg
aggcttattt 
aaagacatcc 
gtgtgatctc 
ggctcaggca 
acatgggggg 
gaccaggtac 
ttcatggtag
gatggaggct
ccgaggcgag
accccatctc 
gttccaatac 
agtgagctga 
acaacaacaa 
aacctacaca 
cacaccatca 
gcgcaggacg 
aagtacctgc 
aacaggggct
ctccacatgc 
acacgcagct 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 
actcgcacac acaagcacac acacagacat gcatgcatgc atccgtgtgt gtgcacctgt 3060

gcccatgagg aaacccatgc atgtgcattc atgcacgcac acaggcaccg gtgggcccat 3120 

gcccacaccc acgagcaccg tctgattagg aggcctttcc tctgacgctg tccgccatcc 3180 

tctcag 3186 

<210> 18 
<211> 781 
<212> DNA 

<213> Homo sapiens 



<400> 18 

gtatgtgcag 

ggagactgag 

taacccaacc 

tggaagggac 

gcgctggggg 

tcctgtttgc 

gcaaacccag 

agcagaggcc 

tcctctgccc 

agccatgtcg 



gtgcctggcc 
tgaatctggg 
actgtcaggc 
aggagctgtc 
gcctggtctc 
cctgtggtgg 
gccaagggct 
gcgtatcacc 
ctggacactt 
aacctgcggt 



tcagtggcag 
cttaggaagt 
tcgtctgccc 
tgggagctgc 
tcctgtttgc 
gattgggctg 
taggaggagg 
acgacagagc 
tgtccagcat 
cctgagctta 



cagtgcctgc 
tcttacccct 
gccctctcgt 
catccttccc 
cccatggtgg 
tctcccgtcc 
ccaggcccag 
cccgcgccgt 
cagggaggtt 
acagcttcta 



ctgctggtgt 
tttcgcatca
ggggtgagca 
accttgctct 
gatttggggg 
atggcactta 
gctaccccac 
cctctgcttc 
tctgatccgt 
ctttctgttc 



tagtgtgtca 
ggaagtggtt 
gagcacctga 
gcctggggaa 
gcctggcctc 
gggcccttgt 
ccctctcagg 
ccagtcaccg 
ctgaaattca 
tttctgtgtt 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 



gtggaaattt 
gggtcgggac 
gtgtctcctg 

g 



cacctggaga agccgaagaa 
agccagagat ggagccaccc 
ggaggggagc tgggctgggc 



aacatttctg tcgtgactcc tgcggtgctt 660 
cgcagaccgt cgggtgtggg cagctttccg 720 
ctgtgactcc tcagcctctg ttttccccca 780 

781 



<210> 19 
<211> 536 
<212> DNA 
<213> Homo sapiens



<400> 19 

gcaagtgtgg 

tgtgtggggc 

ggggcctgga 

agcctggcag 

ccacgcttgg 

ctgccctgag 

ctgggctgcc 

ccagccaggg 

ttggaggatg 



gtggaggcca 
gagcagcctc 
gccacgctgg 
ggtccccaac 
gagccttctg 
ctcctggggt 
tgtctgctcg 
ccacgaggtg 
ccacctctgg 



gtgcgggccc 
agatgctgct 
cagccctatg 
ttcttgaacc 
acccctgacc 
cctgagcaag 
ccccggtgga 
caggccctgc 
cctcttctgg 



cacctgccca 
gaagtgcaga 
tgattaaacg
cctgcttccc 
tgtgtcctct 
ttctctcccc 

ggggtgtctg 

ctgcccggcc 
aacggagtct 



ggggtcatcc 
cgcccccggg 
ctggtgtccc 
atctcagggg 
cacagcctct 
gccccgccgc 
tcccttcact 
acccacacgt 
gattttggcc 



ttgaacgccc
cctgaccctg 
caggccacgg 
cgatggctcc 
tccctggctg 
tccagcgtca 
gaggttccca 
cctaggaggg 
ccgcag 



60 

120 

180 

240 

300 

360 

420 

480 

536 



<210> 20 
<211> 3179 
<212> DNA 
<213> Homo sapiens



<400> 20 

atctcatgtt 

ctgtgagtga 

gtctatgagt 

tttctgatgc 

ccctggaaga 

cttgggcggc 

ctggggtggc 

ggctgggccc 

ggtgcacatc 

cctgtataaa 

ccgcctctgg 

gtggggcagt 

atcctcttat 

tcttcctctt 

tatcctctta 

gtggagctgg 

gaggggcggc 

gcttgggcca 

cctggtgggg

gtttgagtgc

gcgtcattta 

gggcccaagt 

agctgtgagg

ggaagggagc

agggcgggga

gaccaacagg

tctccgggtg

tagaccctta 

ttgtctgttt 



tgaatcctaa 
acggggtggt 
gaatggggtt 
tgtgaggcag 
cataacagta 
ggggatgatg 
aggggtgatg 
cctcctcccc 
ctctgggcca 
atccaggatt 
ccattctctt 
ggagggtgtg 
catctcccag 
atctcccagt 
tctcctagtc 
acatacgtcc 
tcagagggac 
cacgaaaccg 
ccttacggta
agcccggacg 
ttgctgctgc 
ccacagactg 
aaggaggggc 
ggccccgggc 
cttcccagga
tcaggccatt
ttttttgttg 
aaaaaggtat 
ttatttatta 



tgtgcactgc
ggtcagtgcg 
gtggtcagtg 
gaggggaagg 
agtccaggcc 
gagggcctgg
gggggggctg
tgcctcccac
tcagctttca
cctcctcccg 
aagagtagac 
gacacaggag 
tctcatctcc
ctcatctgtc
tcatccagac
ttcctcaggc
gcagccttgg
agggccctgc
tggccgggnc
tgcctggtgt
ttcagagaat
tgtcgcaaat
tcttggcagc
gccgtgggcg
gcagaggccg
gttcagctat
aaattttact
ttgctttgat
ttattattat



atagacacca 
ggcccatggc 
cgggcccatg 
agggtagggg 
cgaagggcag 
ccagggtggc 
gtctgggtgg 
ctgcagccgt 
tggaggtggg 
aacgccccaa 
caggattctg
gcttcagggt
catcctctta
atcctcttac
ttacctccca 
agaaggaact 
ggcgaagaaa 
gtgagtggct 
ctactgagtg
cggggtgggg
gtctgagtga
gcactctggt
cggcctgggg
gacgacctca
ctgctcaggc
ccatcttcta
caggattact
atggcttaac
tattagagat



ctgtatgcaa 
ctggctgtgc 
gcctggctgg 
atagacagtg 
cagggatgct
agggatgatg
cggggaagat
ggatccggat
gggcaggggc 
ctcaggttga 
atctctgaag 
ggggctggtg 
tcatctccca 
catctcccag 
gggcgggtgc 
ggaaggattg 
cagcccctcc 
ccagagcctt 
caccttggac 
gcttatggcc
ccgagcctaa
gcctggagcc
gcgcctttgc
agtgagaggt 
acacctgggt 
caaagctcca 
tatatttttt 
tcactaagca
ggtgtctact



ttacagaagc
atttacggaa
gcctgggagg
ggagccccca 
gggggcccag 
ggggccccag 
ggggaagcct
gtgcttccct
atgacaccat
aagtcacatt
ggtgggtagg 
atgctctctc 
gtctcatctg 
tctcatctct 
caggctcgca 
cagagaacag 
tcagaagttg 
ccagcaggtc 
agggcttctg 
actggatatg
tgtgtatggt
cccgtatagg
cctgcaaact
tggacagaac
ttgaatcaca
gattcctgtt
gctaaagtat
cctactttat
ctgtcaccca



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 
ggttgttagt
gatcctccgg 
ctggcacttt 
cagtagtttg 
gggtaacata 
ccagcatctg 
ggtcatggct 
gaccctgtct
gaaggaaaga 
taggtagact 
aaccccagct 
gacaagcgtg 
ggaggctgtg 
gccccacgct 
acctgggaag 
gcacttgtgg 
acctgtgaat 
tgccagtcag 
gggtccctag 
cactggccac 
ccgctccatg 
acgcctcgat 
tgatgcgttt 
cagtacaggg 



gcagtggcac 
cctcagcttc 
taaaaaccac 
ggaagccgag 
gggagacccc 
tagtcccagc 
gcagtgagct 
caaaaaaaaa 
gaagaagaag 
gtcaaatctc 
ctttggactt 
tatggagcga 
ggtgacacca 
cctgccggtc 
gatgctgtgc 
caggcacaat 
gtgtcacccg 
ccgatcttaa 
aagtgagaga 
tgctggcttt 
ctggaaaagc 
ttcaggccag 
gtgttcagcc 
aaatgaatac 



agtcatggct 
ccagagtgct 
tatgtaaggt 
gcagaaggat 
atctctacaa 
tgctcgggag 
gtgattgtac 
aaaaaaaaaa 
gaagaaggaa 
agagcaaaat 
ccttaggcct 
gtgagttcaa 
gccaggaccc 
ctgcacctgc 
agggggcttg 
tacagcccct 
caaggcagag 
ggtcatcctg 
gggaggcagg 
gagatggagg 
aagcaatcct 
tgggacctgt 
actaagctgc 
agggacagtt 



cgctgtagcc 
gggattacag 
caggtccagt 
tgtctgaggc 
aaaatgcaaa 
gctgagtggg 
catcgcactc 
gaaggagaag 
gaaagaagga 
gaaaataaca 
gaacttcatc 
agcagaaagg 
ctgaaaggga 
tgtaaccgtc 
ccaaactttg
ccccaaagat 
gctggtgaag 
gattatctgg 
ggagagtcag 

agggggtccc 
ccccggtcct 
ttcagctttc 
agtgattcgt 
ctcagagtga 



gcaaaccccc 
gtgtgagcca 
ggcttccaca 
caggagtttg 
aagttatccg 
aggatcgctt 
cagcctgggc 
gagaagagaa 
gaaggaggcc 
aagttttaaa 
tcaagcagct 
gaggagaagc 
gtggttgttt 
gatgttggtg 
gtgggtttca 
gcccacgtcc 
gctgcaggtg 
tgggcctgat 
ag^ggggacg 
cagccaagga 
gagggcacac 
cggcctccag 
cacagcagca 
ctctcagccc 



aggctcaagt 
ctgcccttgc 
cctgtcatcc 
agaccagcat 
ggcgtggggt 
gagcccggga 
aacagagtga 
gaagaaggaa 
tgctaggtgc 
gggaaagaaa 
tccttccaca
aggcaagggt 
tcctgcctca 
ccaggtgccc 
gaagccccag 
ttctcctgga 
gaatcacggc
atggccacaa
tgagaaggac
atgggggcag
ggccctgccc
agctgtaaga 
aatggaatag 
acccctggg 



1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3179 



